BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//pretalx//pretalx.devconf.info//devconf-us-2025//talk//YXUZEB
BEGIN:VTIMEZONE
TZID:EST
BEGIN:STANDARD
DTSTART:20001029T030000
RRULE:FREQ=YEARLY;BYDAY=-1SU;BYMONTH=10;UNTIL=20061029T070000Z
TZNAME:EST
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
END:STANDARD
BEGIN:STANDARD
DTSTART:20071104T030000
RRULE:FREQ=YEARLY;BYDAY=1SU;BYMONTH=11
TZNAME:EST
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
END:STANDARD
BEGIN:DAYLIGHT
DTSTART:20000402T030000
RRULE:FREQ=YEARLY;BYDAY=1SU;BYMONTH=4;UNTIL=20060402T080000Z
TZNAME:EDT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
END:DAYLIGHT
BEGIN:DAYLIGHT
DTSTART:20070311T030000
RRULE:FREQ=YEARLY;BYDAY=2SU;BYMONTH=3
TZNAME:EDT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
END:DAYLIGHT
END:VTIMEZONE
BEGIN:VEVENT
UID:pretalx-devconf-us-2025-YXUZEB@pretalx.devconf.info
DTSTART;TZID=EST:20250919T134000
DTEND;TZID=EST:20250919T141500
DESCRIPTION:Model startup latency is a persistent bottleneck for modern inf
 erence workloads\, particularly when using custom kernels written in Trito
 n that are just-in-time (JIT) compiled. In this talk\, we'll present a no
 vel approach to speeding up model boot times by wrapping Triton kernel ca
 ches in OCI container images.\nWe'll demo a working prototype that packag
 es Triton-generated LLVM kernels into reusable\, portable container laye
 rs. These "hot start" containers can be deployed directly to Kubernetes
 \, bypassing costly JIT compilation and significantly reducing model star
 tup time.\nWhether you're building ML infrastructure\, working with OSS c
 ompilers\, or deploying models at scale\, this talk offers practical tec
 hniques to optimise cold starts for models using Triton-lang.
DTSTAMP:20260315T081500Z
LOCATION:Ladd Room (Capacity 170)
SUMMARY:From Cold Start to Warp Speed: Triton Kernel Caching with OCI Conta
 iner Images - Maryam Tahhan\, Alessandro Sangiorgi
URL:https://pretalx.devconf.info/devconf-us-2025/talk/YXUZEB/
END:VEVENT
END:VCALENDAR
