Alessandro Sangiorgi DevConf.US 2025

Alessandro Sangiorgi

Alessandro Sangiorgi is a Software Engineer in the Emerging Technologies Group within the Office of the CTO at Red Hat. He has extensive experience across Cloud, Distributed Systems, AI, and Networking products and technologies.

Job title –

Software Engineer

Company or affiliation –

Red Hat

Session

09-19

13:40

35min

From Cold Start to Warp Speed: Triton Kernel Caching with OCI Container images

Maryam Tahhan, Alessandro Sangiorgi

Model startup latency is a persistent bottleneck for modern inference workloads, particularly those that rely on custom Triton kernels, which are just-in-time (JIT) compiled. In this talk, we’ll present a novel approach to speeding up model boot times by wrapping Triton kernel caches in OCI container images.
We’ll demo a working prototype that packages Triton-generated LLVM kernels into reusable, portable container layers. These "hot start" containers can be deployed directly to Kubernetes, bypassing costly JIT compilation and significantly reducing model startup time.
Whether you're building ML infrastructure, working with OSS compilers, or deploying models at scale, this talk offers practical techniques to optimise cold starts for models using Triton-lang.

Artificial Intelligence and Data Science

Ladd Room (Capacity 96)
