2026-06-19 – D0207 (capacity 90)
GPU costs are spiraling, yet clusters waste 30–40% of capacity due to static allocation. A GPU assigned to a pod sits idle between inference calls, during model loading, and at startup, and nobody else can use it. It gets worse when VM-based and containerized workloads run on separate clusters: the GPU pool is siloed. No sharing, no reclaim, just waste.
This talk fixes that at the scheduling layer using two upstream Kubernetes projects: KubeVirt, which brings VM workloads under native Kubernetes scheduling, and Dynamic Resource Allocation (DRA), which replaces the rigid device plugin model with a flexible, claim-based API. Together they enable GPU sharing across VMs and containers on a single cluster.
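To make the claim-based model concrete, here is a minimal sketch of how DRA expresses a GPU request, assuming a `ResourceClaim` named `shared-gpu` and a device class `gpu.example.com` published by a hypothetical DRA driver (names are illustrative, not from the talk):

```yaml
# A ResourceClaim describes WHAT device is needed, decoupled from any one pod.
apiVersion: resource.k8s.io/v1beta1
kind: ResourceClaim
metadata:
  name: shared-gpu
spec:
  devices:
    requests:
    - name: gpu
      deviceClassName: gpu.example.com  # assumed driver-provided DeviceClass
---
# A pod (or a KubeVirt virt-launcher pod) then references the claim by name,
# letting the scheduler place the workload where the device can be satisfied.
apiVersion: v1
kind: Pod
metadata:
  name: inference-worker
spec:
  containers:
  - name: app
    image: registry.example.com/inference:latest  # illustrative image
    resources:
      claims:
      - name: gpu
  resourceClaims:
  - name: gpu
    resourceClaimName: shared-gpu
```

Because the claim is a first-class API object rather than a per-node device plugin count, it can be released and re-granted as workloads come and go, which is what makes sharing across the VM and container pools possible.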
We'll walk through real scheduling data, the DRA resource claim model, and how the KubeVirt VM lifecycle integrates with DRA's structured parameter API. No theory-heavy slides: just the problem, the architecture, and what works.
Basavaraju G is a Product Owner for Red Hat OpenShift Add-Ons on IBM Z & LinuxONE at IBM Labs, and a Red Hat Partner Engineer, driving cloud-native parity across OpenShift, OpenShift Virtualization, Pipelines, and Red Hat Quay. A researcher and open source contributor, he holds two patents and has authored three IEEE publications on ML and containers. He actively contributes to Tekton and KubeVirt — CNCF incubating projects — extending cloud-native capabilities to the s390x architecture.