DevConf.US 2025

llm-d: Kubernetes Native Distributed Inferencing
2025-09-19, Ladd Room (Capacity 170)

llm-d is a well-lit path for anyone to serve LLMs at scale, supporting any model on a diverse set of hardware accelerators. Come learn how llm-d enables distributed inference at scale!


Audience experience level: Beginner - no experience needed

Robert is a director of engineering at Red Hat. Before joining Red Hat, he was senior director of engineering at Neural Magic. He is a core committer to vLLM and a maintainer of llm-d.