Roberto Carratalá
Roberto is a Principal AI Architect in the AI Business Unit, specializing in container orchestration platforms (OpenShift and Kubernetes), AI/ML, DevSecOps, and CI/CD. With over 10 years of experience in system administration, cloud infrastructure, and AI/ML, he holds two MSc degrees, in Telco Engineering and in AI/ML.
Principal AI Platform Architect
Company or affiliation – Red Hat
Session
Effectively deploying Large Language Models (LLMs) on Kubernetes is critical for modern AI workloads, and vLLM has emerged as a leading open-source project for LLM inference serving. This session will examine the features that set vLLM apart by maximizing throughput and minimizing resource usage. We'll walk through the lifecycle of deploying AI/LLM workloads on Kubernetes, focusing on seamless containerization, efficient scaling with Kubernetes-native tools, and robust monitoring to ensure reliable operations.
By leveraging features such as continuous batching and distributed serving, vLLM simplifies complex workloads, optimizes performance, and makes advanced inference accessible for diverse and demanding use cases. Join us to learn why vLLM is shaping the future of LLM serving and how it integrates with Kubernetes to deliver reliable, cost-effective, and high-performance AI systems.
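As a rough illustration of the containerization step discussed above, a minimal Kubernetes Deployment for a vLLM OpenAI-compatible server might look like the sketch below. The image tag, model name, and GPU resource request are assumptions for illustration, not recommendations from the session.

```yaml
# Hypothetical sketch: serving a small model with vLLM on Kubernetes.
# The image tag, model, and resource values are illustrative assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vllm-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vllm-server
  template:
    metadata:
      labels:
        app: vllm-server
    spec:
      containers:
      - name: vllm
        image: vllm/vllm-openai:latest      # vLLM's OpenAI-compatible server image
        args: ["--model", "facebook/opt-125m"]  # small model, for illustration only
        ports:
        - containerPort: 8000               # default vLLM API port
        resources:
          limits:
            nvidia.com/gpu: "1"             # one GPU per replica
```

A Service and, for production, an autoscaler and monitoring stack would typically accompany a manifest like this; those are the Kubernetes-native scaling and monitoring concerns the session covers.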