Meet and network with the following communities on the show floor!
- Red Hat
- Foreman, Katello, and Pulp Community Booth
- Podman
- RISC-V
- Ubuntu
- UXD
Join us for the conference opening with the organizers!
TBD
Efficient data ingestion is foundational to modern AI-driven applications, yet developers face significant challenges: unstructured data, sensitive information management, and rising costs from excessive model fine-tuning. Fortunately, cloud-native Java runtimes like Quarkus simplify this process by seamlessly bridging data ingestion and AI workflows, primarily through Retrieval-Augmented Generation (RAG). In this hands-on technical workshop tailored for developers and AI engineers, we'll explore how Quarkus empowers teams to ingest, structure, and query data, making institutional knowledge instantly available to large language model (LLM) consumers.
Participants will:
* Structure Unstructured Data: Learn to extract actionable insights from PDFs, proprietary formats, and unstructured documents using the open-source Docling project, preparing your data for seamless AI integration.
* Deploy and Utilize RAG Effectively: Understand how RAG enables real-time retrieval and enhances generative responses without extensive fine-tuning. We’ll also cover targeted fine-tuning with InstructLab for specialized, domain-specific knowledge.
* Build a Practical Application: We'll close the workshop by constructing a privacy-conscious, searchable, AI-powered ticketing solution inspired by systems like ServiceNow.
Join us and discover how easily Quarkus and RAG can transform your raw data into secure, powerful, and instantly accessible business insights.
As a developer or user experience (UX) practitioner, you know that early and frequent feedback from users is a necessary part of ensuring your project’s success. However, finding users with relevant experience to participate in UX research activities is often a barrier - especially for highly technical enterprise applications. That gives UXers working in open source a big advantage. By their very nature, open source projects include communities of users who are experienced with and passionate about the projects they contribute to. Community members and contributors have a vested interest in improving their project’s ease of use, making them ideal and enthusiastic user research participants.
We’ll talk about:
- How our UX team engaged with the upstream Kubeflow community for an AI-related research project
- Challenges we encountered while conducting research with community members
- Lessons we learned from our experience
Large language models learn to predict human and machine text as sequences of “tokens.” But what are these tokens, and how are they used to represent text? The answers to these questions matter: they form the foundation of how every LLM generates its output, and how its output correctness trades off against compute performance.
In this talk Erik Erlandson will explore a variety of algorithms used to tokenize text before it's processed by these models, focusing on their trade-offs and impact on model performance. He’ll compare algorithms for word-based, subword-based, and character-level tokenization, including widespread approaches such as Byte Pair Encoding and WordPiece.
Attendees will gain an understanding of how LLMs depend on tokenization and how choices of tokenization impact model performance tradeoffs.
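To make the idea concrete, here is a minimal, self-contained sketch of the core BPE merge loop on a toy corpus. This is illustrative only; production tokenizers behind Byte Pair Encoding and WordPiece add byte-level handling, special tokens, and many optimizations.

```python
from collections import Counter

def bpe_merge_step(corpus):
    """Find the most frequent adjacent symbol pair and merge it everywhere."""
    pairs = Counter()
    for symbols, freq in corpus.items():
        for pair in zip(symbols, symbols[1:]):
            pairs[pair] += freq
    if not pairs:
        return corpus, None
    best = max(pairs, key=pairs.get)
    merged = {}
    for symbols, freq in corpus.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                out.append(symbols[i] + symbols[i + 1])  # merge the pair into one token
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = freq
    return merged, best

# Toy corpus: each word is pre-split into characters and mapped to its frequency.
corpus = {tuple("lower"): 5, tuple("lowest"): 2, tuple("newer"): 6, tuple("wider"): 3}
for _ in range(4):
    corpus, pair = bpe_merge_step(corpus)
    print("merged", pair, "->", sorted(corpus))
```

Running this shows frequent pairs like ("e", "r") merging first, which is exactly how subword vocabularies emerge from raw character statistics.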
This talk for beginners will offer a high-level view of how public-key cryptography works, why quantum computing threatens existing public-key cryptography algorithms, and what needs to be done to harden cryptography against quantum attacks.
Retrieval-Augmented Generation (RAG) workloads are powerful—but deploying them can be complex. In this session, I’ll walk through how we built a golden path for AI chatbot deployment using Red Hat Developer Hub on OpenShift, GitOps workflows, and Argo CD. With a single template, we automate GitOps repository scaffolding, Argo CD manifests, and Helm chart deployments—integrating AI model inference, GPU support, and pipeline-ready structure from the start.
You’ll see how we combined DevOps best practices with developer experience tooling to make AI application onboarding faster, more consistent, and fully auditable.
Ideal for developers, admins, and DevOps engineers, this session shows how Red Hat Developer Hub can serve as the foundation for automating complex RAG deployments within a GitOps framework. Whether you’re supporting LLM projects or scaling internal platforms, you’ll leave with practical steps to go from zero to production-ready—confidently and repeatably.
The project aims to assess the feasibility and effectiveness of an AI-enabled chatbot for mental health detection, employing Large Language Models (LLM), Natural Language Processing (NLP), and Deep Learning models. The web application integrates social attributes to aid users with mental health concerns, offering self-assistance through personalized assessments. The core strategy centers on fostering an "Optimistic Presence" by deploying an AI-driven virtual assistant capable of empathic conversations, active listening, and emotional state analysis. The methodology involves emulating human mental health professionals, assessing conditions through various cues, and offering tailored therapeutic interventions for stressed-out individuals. Integration with health records using Azure PostgreSQL allows collaboration with human providers for comprehensive care. This innovative solution seeks to extend constant virtual AI therapy, revolutionizing mental health support with technology-driven personalized assistance for students, working professionals, and many hidden victims of poor mental health.
BOF to discuss all things containers.
Thanks to AI, designers and PMs are now prototyping and deploying faster than ever—sometimes without writing a single line of code. Tools like Figma AI and Copilot are blurring the lines between design and implementation. So where does that leave frontend engineers?
In a world where AI can generate flawless UIs, auto-wire flows, and ship entire components from a single prompt, it's tempting to declare the death of frontend engineering.
But what if we're just looking at it wrong?
The pixel-pushing days may be over, but the complexity is just shifting layers. Welcome to the age of the back of the frontend. This talk explores how AI is fragmenting the frontend into fast, flashy outputs and deeply nuanced logic, and why the real value now lies beneath the surface.
We'll break down what's actually being automated (spoiler: it's not the hard stuff). The logic, the flows, the edge cases, the performance: that's still on us. We'll look at where human acuity still matters, and why knowing when not to use AI might be your new superpower.
This talk doesn’t shy away from the uncomfortable question: what is your job when everyone can ‘build’?
If you're a frontend dev wondering whether you still have a job—you do. But it may not look the same as the one you trained for.
Why do so many LLM-based AI agents break down in real-world settings? Despite their increasing popularity, many are fragile, hard to scale, and tightly coupled to specific toolchains. We set out to build a modular, cloud-native agent stack using open source components—and one component that stood out was the Model Context Protocol (MCP).
MCP is an open standard that enables AI assistants and agents to seamlessly connect with real data sources—content repositories, business tools, development environments, and more. Think of it as a USB-C port for AI applications—rather than building custom integrations for every tool, MCP simplifies connectivity, authentication, and data flow, ensuring seamless interoperability between AI models and external systems.
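To ground that, here is a hedged sketch of an MCP tool server using the official `mcp` Python SDK's FastMCP helper. The ticket-lookup tool and its data are hypothetical stand-ins for a real business system.

```python
# Minimal MCP server sketch (assumes the `mcp` Python SDK is installed).
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("ticket-tools")  # server name is arbitrary

@mcp.tool()
def lookup_ticket(ticket_id: str) -> str:
    """Return the status of a support ticket."""
    tickets = {"T-1001": "open", "T-1002": "resolved"}  # stubbed data source
    return tickets.get(ticket_id, "unknown")

if __name__ == "__main__":
    mcp.run()  # serves the tool over stdio to any MCP-capable agent
```

Once registered with an MCP-aware agent, the model can discover and invoke `lookup_ticket` without any bespoke integration code.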
In this talk, we’ll walk through how we integrated MCP into an open AI agent architecture leveraging:
1) vLLM for efficient model inference
2) Llama Stack as the open source agent framework
3) MCP to handle tool invocation and data flow
4) Kubernetes for scalable, cloud-native deployment
We’ll walk through the architecture, demo the system in action, and share lessons learned along the way. You’ll gain a solid understanding of how MCP works, its role in the AI ecosystem, and whether it’s just hype or a game-changer. Whether you're an AI researcher, open source contributor, developer, or architect, you'll walk away with practical insights on using MCP to build more dynamic and efficient AI applications.
Competition for memory bandwidth and CPU caches between containers can increase application response times by 5x to 14x, even with CPU and memory limits in place. It can be triggered by common events like garbage collection, and existing observability tools do not collect the metrics to detect it. Because it manifests as latency SLO violations, operators often scale out and run at low utilization, which is expensive and only marginally improves response times.
CPU performance counters can detect memory interference. However, since interference events are frequent and short-lived, detecting them requires high-frequency measurements, which is challenging due to jitter and overhead.
This session first presents the causes of memory noisy neighbors, real-world patterns that trigger it, and the benefits of mitigation. We then show how a new open source collector combines CPU performance counters, eBPF, and high-resolution timers to identify noisy neighbors in Kubernetes.
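As a rough illustration of the measurement principle (not the talk's collector, which pairs counters with eBPF and high-resolution timers), last-level-cache miss ratios can already be watched with `perf`. This hedged sketch assumes Linux with perf installed and the LLC events exposed by the CPU; the threshold is illustrative.

```python
import subprocess

# Sample system-wide LLC counters every 100 ms; perf emits CSV lines on stderr.
proc = subprocess.Popen(
    ["perf", "stat", "-a", "-x", ",", "-I", "100",
     "-e", "LLC-load-misses,LLC-loads"],
    stderr=subprocess.PIPE, text=True)

window = {}
for line in proc.stderr:
    fields = line.strip().split(",")
    if len(fields) < 4 or not fields[1].isdigit():
        continue  # skip comments and intervals where a counter was unavailable
    ts, value, event = fields[0], int(fields[1]), fields[3]
    window[event] = value
    if len(window) == 2:  # both counters seen for this interval
        ratio = window["LLC-load-misses"] / max(window["LLC-loads"], 1)
        if ratio > 0.5:  # illustrative threshold; tune per platform
            print(f"{ts}: possible memory interference (LLC miss ratio {ratio:.2f})")
        window.clear()
```

A sustained spike in the miss ratio while a latency-sensitive pod degrades is the kind of signal the session's collector captures with far less overhead and jitter.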
Users of radio-controlled clocks have discovered that the longwave radio signals that set the time on these devices can be replicated using only a laptop or smartphone and a set of wired headphones or speakers. We'll talk about how this works, why people do it, and a practical example in the C language.
As a Software Engineer or Quality Engineer, do you know how to ensure accessibility is maintained throughout the development process? Are you interested in knowing how to automate accessibility checks and incorporate them into your unit and regression suites?
Join me as I explore practical approaches to integrating accessibility testing into your CI/CD process, thereby detecting potential accessibility issues early in the software development process, along with real-world examples and code-slinging demos.
Developers, Quality Engineers, and Designers: let's start building a more inclusive digital world.
"We want more user research!" "We have 0 time to do it!" "We don't even have mockups yet!" Do these phrases sound familiar when it comes to to getting user feedback? In this talk, Sr. Interaction Designer Mary Shakshober will give a deep dive on a user research method she tried out recently to uncover user mental models and existing perspectives on terminology used within technical graphical user interfaces (GUIs). Learn how to facilitate discovery research that takes just a couple hours to plan, requires no mockups, only takes about 20 minutes to facilitate with users, and most importantly, produces valuable insights for product design.
Kubernetes is built to scale—but how do you prove it? Whether optimizing for cost or deploying at hyper-scale, benchmarking is crucial to understanding where your clusters thrive and where they break. Enter Kube-burner and Cluster Loader 2, two powerful tools designed to push Kubernetes to its limits and expose performance bottlenecks before your users do.
But with great power comes complexity. How do you design meaningful benchmarks? What metrics truly matter? And how do you simulate real-world workloads without misleading results? This talk explores how Kube-burner and Cluster Loader 2 generate large-scale workloads, measure stress-induced performance, and uncover cluster-tuning insights. You'll learn best practices, pitfalls, and how to turn raw numbers into optimizations. Whether you're an SRE, platform engineer, or Kubernetes enthusiast, this session will help you set baselines, break limits, and push Kubernetes to its true potential.
Generating high-quality, domain-specific data for large language models (LLMs) is a significant challenge, particularly when creating datasets relevant for model customization and fine-tuning. In this session, the audience will learn how synthetic data generation techniques can address this challenge. The speaker will cover how third-party teacher models like Mixtral, Mistral, Phi-4, and LLaMA streamline the process, along with open-source tools like Docling, which breaks down complex knowledge documents into semantic chunks.
Additionally, the session will walk the audience through how they can bring their own teacher models into the SDG workflow, enabling the creation of higher-quality samples for model customization. Attendees will also learn how to build modular, flexible workflows without any coding skills, making it easy to scale data generation tasks.
The session will demonstrate how to efficiently process large volumes of knowledge data and generate high-quality samples using an open-source, cost-effective approach to building production-ready LLMs—without relying on extensive manual annotation.
Think UX is just for GUIs and doesn't apply to your APIs, data structures, or backend services because there's "no UI"? This talk debunks that myth, revealing how mapping user journeys and carefully structuring data within these 'unseen' components creates more intuitive systems for both human developers and increasingly sophisticated AI agents. Discover how focusing on the user's path can make your API and other non-UI work truly shine, pixels or not!
Have you ever wondered how web pages are seamlessly redirected to new locations? Have you ever noticed how a broken link is quietly redirected to a working one?
Application migration projects frequently encounter complexities in managing redirects and ensuring they function correctly through rigorous testing.
Join me as we explore the details of server-side and client-side redirects, uncovering their mechanics, use cases, and testing strategies with real-world examples and demos.
Key takeaways for the audience:
1. Understand the basics of HTTP status codes and redirect types.
2. Learn how to automate redirect testing and which tools are available for this purpose.
3. See best practices and real-world examples.
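As a taste of the automation, here is a small sketch using Python's `requests` library to follow and assert a redirect chain. The URLs and the expected final target are hypothetical.

```python
import requests

# Follow the chain and inspect every intermediate response.
resp = requests.get("http://example.com/old-page", allow_redirects=True, timeout=10)
for hop in resp.history:
    print(hop.status_code, "->", hop.headers.get("Location"))
print("final:", resp.status_code, resp.url)

# In a regression suite, pin the expected behaviour:
assert all(hop.status_code in (301, 302, 307, 308) for hop in resp.history)
assert resp.url == "https://example.com/new-page"  # hypothetical final target
```

Checks like these drop naturally into a CI pipeline, so a migration that breaks a redirect fails the build instead of breaking production links.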
Large Language Models (LLMs) are increasingly used in real-world applications, but continually adapting them to new tasks without catastrophic forgetting remains a major challenge. In this talk, I introduce a novel full-parameter continual learning method that leverages adaptive Singular Value Decomposition (SVD) to dynamically identify and protect task-critical subspaces. By constraining updates to orthogonal low-rank directions, our method enables models to retain previous knowledge without adding extra parameters. Compared to strong baselines like O-LoRA, our approach achieves up to 7% higher accuracy while maintaining general language capabilities and safety. This talk will present the methodology, theoretical foundations, and extensive empirical results, demonstrating a scalable path toward continually evolving LLMs.
Today’s rapidly moving frontend landscape often requires developers to maintain consistent UI components across multiple frameworks: React, Vue, Svelte, Angular, and beyond. This talk dives into concepts for packaging and delivering reusable UI pieces that transcend framework boundaries.
Whether you're working in a mono-repo or orchestrating micro frontends, we'll explore how to design, build, and deliver components that can be consumed across frameworks with minimal friction without sacrificing performance, dev experience, or maintainability.
Congratulations,
You’ve successfully deployed and tightly configured the most secure Kubernetes platform on the planet.
Now, the question is: How can your team achieve a secure posture for the container workloads themselves?
This talk provides a practical guide to using the open source StackRox project (Red Hat ACS) to secure containers throughout their lifecycle: Build, Deploy, and Runtime. We will focus on security policies and dive into specific controls.
By the end of this session you will understand:
- The 10 essential security risk areas throughout the container's lifecycle
- How to mitigate each risk using StackRox
- The kinds of controls that can be enforced through StackRox’s security policies
- The distinctions between StackRox policies for the Build, Deploy, and Runtime phases, and why we use Deploy-type policies during the Build phase
- The enforcement and remediation actions you can take at each step
Are you curious about the Agile mindset and its popular frameworks? If you're looking for an engaging way to deepen your understanding of these concepts, you're in the right place. Learning through play can be a powerful tool, and there's an exciting opportunity to explore Agile principles in a hands-on way. Let's dive into a unique workshop designed to enhance your grasp of Agile methodologies through interactive experiences.
During this workshop, we will explore:
- Self-organization
- "Inspect & Adapt" approach to Continuous Iteration/Improvement
- Collaboration
- Inclusion
Embracing the Agile mindset through interactive learning not only makes the concepts more relatable but also empowers you to apply them effectively in real-world scenarios. So, gear up for an enriching experience that promises to transform your understanding of the Agile mindset and lifecycle methodologies/frameworks. Let's take this opportunity to learn, engage, and grow together in the fascinating world of Agile!
The Extended Berkeley Packet Filter (eBPF) has emerged as a powerful and transformative technology, fundamentally changing how we instrument, observe, and secure modern computing systems. This talk will provide a comprehensive introduction to eBPF, starting with a clear explanation of what it is: a virtual machine within the Linux kernel that allows for the safe execution of user-defined programs without modifying kernel source code or loading kernel modules. We will then delve into its core components, including eBPF programs, maps, and helpers, illustrating how these elements interact to enable dynamic and efficient kernel-level functionality.
A key focus will be on differentiating eBPF-based instrumentation from traditional methods. We will highlight why eBPF is inherently lighter weight, offering significantly reduced overhead compared to conventional approaches like agents or code modification. Furthermore, we will explore eBPF's unique capability to instrument virtually any language or application running on a system, regardless of its original programming language or runtime, by leveraging its deep insights into kernel events and system calls.
Attendees will gain a solid understanding of eBPF's architecture, its practical advantages for observability and security, and its potential to unlock unprecedented levels of system introspection and control.
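For a taste of how lightweight this can be, here is the classic hello-world pattern from the bcc toolkit (assumes bcc is installed and the script runs with root privileges): a few lines load a program into the kernel and trace every clone() syscall, with no kernel module and no application changes.

```python
from bcc import BPF

# The eBPF program itself is a small C snippet, compiled and verified at load time.
prog = r"""
int hello(void *ctx) {
    bpf_trace_printk("clone() called\n");
    return 0;
}
"""

b = BPF(text=prog)  # compile and load the program into the kernel
b.attach_kprobe(event=b.get_syscall_fnname("clone"), fn_name="hello")
print("Tracing clone()... Ctrl-C to quit")
b.trace_print()  # stream bpf_trace_printk output from the kernel trace pipe
```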
WebAssembly (Wasm) is no longer just for the browser—it's rapidly becoming a powerful tool in the cloud-native developer’s toolbox. In this fast-paced session, we will break down why Wasm is being adopted in cloud-native development, from lightweight microservices to secure sandboxable systems in service meshes. Drawing on industry data and real-world use cases, the talk will highlight Wasm’s key benefits—portability, performance, and security—and where it fits in modern platform and application architectures. Whether you’re curious about Wasm’s role in modern ecosystems or how it compares to Kubernetes or containers, this session will leave you with a clear picture of how WebAssembly can enhance cloud-native development.
In 2024, inspired by the Mass Open Cloud (MOC) and in collaboration with the MOC Alliance (MOC-A), Red Hat Waterford and the Walton Institute at SETU partnered to establish an experimental cloud for researchers, staff, and students in both organisations. These are the first steps in extending the MOC-A into Europe. Already, this private/public sector collaboration has created a community of new OpenShift and RHEL users with bare metal access, and it has unblocked our work with partners on multiple Horizon EU projects. It has been made possible by recycling a supercomputer installed in the liquid-cooled Walton data centre, with operations provided by SETU, tech support for the ops team provided by Red Hat, and initial testing by the RH-OCTO team in Waterford. Positive interest around the project has attracted additional infrastructure in the form of storage funded by SETU and an IBM Z LinuxONE mainframe.
Do you need access to multiple multi-node clusters on OpenShift AI on RHEL?
Do you need access to OpenShift on OpenStack?
Join us in Boston for a quick demo. Learn more about the current configuration, current and future users and usage, approaches to users and identity, VMs, fine tuning, and future plans for growth through self funding.
Model startup latency is a persistent bottleneck for modern inference workloads, particularly when using custom kernels written in Triton that are Just In Time (JIT) compiled. In this talk, we’ll present a novel approach to speeding up model boot times by wrapping Triton kernel caches in OCI container images.
We’ll demo a working prototype that packages Triton-generated LLVM Kernels into reusable, portable container layers. These "hot start" containers can be deployed directly to Kubernetes, bypassing costly JIT compilation and significantly reducing model startup time.
Whether you're building ML infrastructure, working with OSS compilers, or deploying models at scale, this talk offers practical techniques to optimise cold starts for Models using Triton-lang.
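As a hedged sketch of the warm-up step (the OCI packaging itself is the subject of the talk), a build stage can point Triton's cache at a directory baked into an image layer and run each kernel once so the JIT artifacts are captured. The vector-add kernel below is a stand-in for a model's real kernels; `TRITON_CACHE_DIR` is the standard Triton cache-location override.

```python
import os
os.environ["TRITON_CACHE_DIR"] = "/opt/triton-cache"  # directory shipped as an image layer

import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n, BLOCK: tl.constexpr):
    pid = tl.program_id(0)
    offs = pid * BLOCK + tl.arange(0, BLOCK)
    mask = offs < n
    x = tl.load(x_ptr + offs, mask=mask)
    y = tl.load(y_ptr + offs, mask=mask)
    tl.store(out_ptr + offs, x + y, mask=mask)

# The first invocation triggers JIT compilation; the cache directory is now "hot"
# and can be committed as a container layer for fast starts on identical GPUs.
x = torch.rand(1024, device="cuda")
y = torch.rand(1024, device="cuda")
out = torch.empty_like(x)
add_kernel[(1024 // 256,)](x, y, out, 1024, BLOCK=256)
```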
In today’s decentralized and cloud-native world, how do you ensure identity is managed securely, seamlessly, and at scale? What happens when your users span across multiple clusters, platforms, and regions—but still expect a unified login experience? Enter Keycloak, the open source identity and access management (IAM) hero we often overlook.
In this session, we’ll explore the Keycloak Chronicles—real-world lessons and hands-on strategies for implementing secure IAM in distributed architectures. Through interactive Q&A-style storytelling, we’ll tackle challenges like federated identity, fine-grained access control, and integrating Keycloak with modern workloads (Kubernetes, APIs, and microservices). Curious how to handle multi-tenant authentication or integrate with GitHub, LDAP, or SSO providers? We’ll answer that too. By the end, you’ll walk away with a clear roadmap to managing identity in a decentralized world—with Keycloak at the helm. Come with questions, leave with clarity.
As security threats evolve and compliance mandates tighten, developers and platform engineers are increasingly expected to bake security and crypto agility into their workflows from the start. In this talk, we'll explore how the open source HashiCorp Vault Community Edition can serve as a powerful cryptographic control plane, not just for secrets, but for managing the full lifecycle of cryptographic assets, including encryption keys, certificates, and algorithm policies. We'll discuss real-world scenarios where Vault enables centralized key management, streamlined certificate issuance and renewal, and integration with CI/CD pipelines for automated crypto operations.
The session will also unpack why Post-Quantum Cryptography (PQC) matters now, not later, and how open source communities and forward-looking practitioners can extend Vault to experiment with PQC algorithms, support multiple cryptographic backends, and plan for seamless algorithm transitions. Whether you're a security-minded developer, a compliance lead, or an SRE looking to future-proof your infrastructure, this session will provide a practical blueprint for using community-driven tools to solve enterprise-scale security challenges without vendor lock-in.
Your Kubernetes clusters are running fine — but is your cloud bill?
As organizations adopt Kubernetes for its scalability and flexibility, cloud costs can often escalate before anyone realizes. Despite the availability of numerous cost-saving strategies, many fail to deliver meaningful impact without extensive architectural or tooling changes.
In this talk, I’ll walk you through three practical strategies that significantly reduced cloud spend in our Kubernetes/OpenShift environment — without sacrificing performance or stability. These approaches are grounded in real-world implementation and are designed to bring measurable benefits with minimal disruption.
We’ll explore:
- The core principles behind each strategy
- The prerequisites to consider before implementation
- The types of workloads best suited for each approach
To broaden your toolkit, the session will also include a curated list of industry-recognized cost optimization techniques, helping you determine which methods are most applicable to your infrastructure and business model.
Whether you're a platform engineer, SRE, or DevOps professional, you'll leave this session with actionable insights and a framework to make smarter, cost-conscious decisions for managing Kubernetes at scale.
⚡ Takeaways:
- Proven, field-tested cost optimization strategies
- Insight into workload-specific tuning
- Broader awareness of cost control techniques used across the industry
Actual content TBD - need to adjust to 10 minute window of lightning talk. It will be a subset of the original.
Original:
With the recent adoption of Podman by the Cloud Native Computing Foundation (CNCF) in the sandbox space, it has become increasingly important for developers and system administrators to familiarize themselves with its powerful capabilities. Podman is a rootless container engine that enables users to run containers, images, and volumes seamlessly across Linux, Windows, and macOS, offering enhanced security and flexibility in container management.
In this presentation, I will explore five often-overlooked features of Podman that can significantly enhance your container management experience. The talk will start with a brief introduction to Quadlets for declarative systemd configuration and integrations with tools like Cockpit and Ansible. We will also delve into Podman commands such as Auto-Update, Healthcheck, and Farm to improve your node management. Additionally, I will cover new features that help with healthcheck log location and container log size, plus a quick demo with tips on making the best of Podman's shell completion.
Join Christopher Nuland as he revisits the thrilling world of the 1990s arcade game Double Dragon, exploring advanced techniques in distributed AI training using Kubernetes and KubeRay on OpenShift. This session dives into the application of OpenShift to deploy game simulations across a cluster, enabling rapid AI training through distributed computing. Discover how the integration of KubeRay enhances these processes, significantly reducing the time required for training reinforcement learning models like Deep Q-Network (DQN) and Proximal Policy Optimization (PPO).
Witness firsthand how these technologies are applied to train AI agents that can master complex video games, demonstrating the power and scalability of OpenShift for AI training. The talk will cover practical steps for setting up and managing distributed training environments, optimizing resource usage, and achieving faster convergence times in AI model training.
Beyond gaming, Christopher will discuss the broader implications of these techniques in fields requiring large-scale AI solutions, such as healthcare and autonomous driving. The presentation aims to empower attendees with the knowledge to leverage Kubernetes and OpenShift in their AI projects, fostering innovation and efficiency in their operations.
In a world of increasing compliance requirements and heightened security expectations, FIPS (Federal Information Processing Standards) compliance is more than just a checkbox. But how do you ensure your artifacts truly meet FIPS standards?
This talk demystifies FIPS compliance for container images: what it covers and how compliance is validated. We’ll explore check-payload, a lightweight, open source CLI tool built to scan container images for FIPS compliance.
We will also demonstrate how we plugged this check into a secure CI/CD pipeline that leverages Tekton Chains. Attendees will walk away with a clear understanding of what FIPS compliance entails, plus practical tools and patterns to integrate FIPS checks into their CI/CD workflows.
Kubernetes has transformed container orchestration, but managing Kubernetes clusters and Internal Developer Platforms (IDPs) at scale remains a complex challenge for platform engineers. While ClusterAPI offers a declarative way to provision and operate clusters, it often brings complexity through verbose YAML and integration difficulties.
This talk introduces an AIOps-driven approach that leverages ClusterAPI, Sveltos, and templated automation to enable end-to-end lifecycle management—provisioning, upgrades, and teardown—across diverse environments.
Through a real-world, large-scale platform demo, we’ll explore intelligent multi-cluster management, with enhanced observability, automated scaling, and proactive anomaly detection. Attendees will gain practical insights into building adaptive, resilient, and developer-friendly platform experiences on Kubernetes.
A gathering is an informal space for individual contributors working in or adjacent to a specialized and timely topic to share (not critique or evaluate) ideas that are in progress or perhaps should be.
We want to see new and experienced speakers alike. If you have been working on something in the realm of containers, submit a talk; sign-ups will be on-site during the conference!
The format
- Each speaker has 10 minutes to share an idea.
- Ideas can be big or small, from a show-and-tell of a personal project to a loosely sketched-out paradigm shift.
- The speaker will indicate their desired next steps at the end, which can range from organizing a dedicated breakout session to a no-op.
Gathering Etiquette
- All participants should help speakers feel heard by giving them their full attention.
- We are all working towards building a shared vision and a shared understanding. This is not the time to work through implementation details or identify risks. Positive vibes only.
Info for potential speakers (that's you!)
- Each speaker represents themselves, unless otherwise specified.
- Sharing an idea is not a commitment to implementing the idea.
- Due to the informal nature of the gathering, there will not be an opportunity to use slides. But feel free to get creative!
- Limit your talk to 10 minutes.
We will allocate time on a first-come, first-served basis. Although we do not curate the talks, we will ensure that topics pertain to the containerization space.
Large language models are powerful—but they’re also resource-intensive. Running them in production can be scary expensive without the right tooling and optimizations. That’s where vLLM and quantization come in: together, they offer a practical path to serving models at high speed and low cost, even on modest hardware.
In this workshop, you’ll learn how to combine vLLM’s high-performance serving engine with quantized models. Whether you're deploying to GPU servers in the cloud or smaller-scale on-prem environments, you’ll leave with the skills to drastically reduce inference latency and memory usage—without compromising output accuracy.
You’ll learn how to:
- Deploy quantized LLMs using vLLM’s OpenAI-compatible API
- Choose the right quantization formats for your hardware and use case
- Use tools like llm-compressor to generate optimized models
- Benchmark and compare performance across different quantization settings
- Tune vLLM configurations for throughput, latency, and memory efficiency
By the end of the session, you will know how to deploy your own quantized model on vLLM and apply these optimizations to your own production Gen AI stack.
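As a preview of the workflow, here is one hedged example: serve a pre-quantized model with vLLM and query it through the OpenAI-compatible API. The model name is an example from Hugging Face; any AWQ or GPTQ model your hardware supports works the same way.

```python
# Start the server first, e.g.:
#   vllm serve TheBloke/Mistral-7B-Instruct-v0.2-AWQ --quantization awq
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")  # vLLM ignores the key
resp = client.chat.completions.create(
    model="TheBloke/Mistral-7B-Instruct-v0.2-AWQ",  # example quantized model
    messages=[{"role": "user", "content": "Why does quantization cut memory use?"}],
    max_tokens=128,
)
print(resp.choices[0].message.content)
```

Because the API surface matches OpenAI's, existing application code can be pointed at a quantized, self-hosted model with a one-line base URL change.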
In OpenShift, the default container datapath is implemented using veth pairs connected to an OVS bridge, relying on kernel-space networking. With the introduction of DPDK and VDUSE (vDPA devices in userspace), OpenShift now not only offers a userspace datapath solution for container networking, but also unlocks the benefits of DPDK to containerized workloads—similar to the vhost-user solution used in traditional virtualized environments.
But what are the actual performance characteristics of the userspace datapath in OpenShift? And how does it compare to the kernel-space datapath?
In this talk, we will first present the technical architecture of the VDUSE OVS-DPDK userspace datapath solution we are implementing in OpenShift. We will then answer these questions by showcasing real-world benchmark results, focusing on key metrics such as latency, throughput, packets per second (PPS), among others.
As LLMs move into enterprise workflows, developers face a new kind of architecture challenge: how do you build reliable, interpretable systems powered by agents and reasoning?
This talk unpacks how we designed and implemented an AI orchestration framework for enterprise architecture — combining LangGraph for multi-agent workflows, Flyte for distributed execution, and AWS Bedrock for LLM inference using Claude 3. The product: an AI copilot for enterprise architects, deeply rooted in your tech stack context.
At the core of this system is a domain-specific knowledge graph that acts as long-term memory for the agents. It enables persistent, structured representations of architectural state, system dependencies, and business context — giving the agents the grounding they need to generate accurate recommendations, translate natural language into SQL or code, and maintain continuity across workflows.
We’ll also cover how we’ve integrated observability practices — including planned OpenTelemetry instrumentation — to trace and debug autonomous AI systems in production.
If you’re a developer or AI engineer thinking beyond the chatbot and looking to embed reasoning into complex system design and data tasks, this talk offers an end-to-end blueprint — from orchestration and grounding to production monitoring.
Tired of manually triggering deployment pipelines and tracking releases across environments?
Let’s explore how to build a truly automated, self-sustaining software delivery process.
Despite advancements in CI/CD automation, deployments and releases continue to be a significant operational burden for most organizations.
This talk doesn’t focus on specific tools or products but instead walks you through a production-proven approach for streamlining software delivery by tightly integrating:
- Project management tools like Jira as the source of truth
- CI/CD pipelines for build, test, and deployment
- Cloud-native event-driven services to glue everything together
At the core of this system is the integration of Jira workflows with your automation stack. Every state change in Jira (such as moving a ticket to “Ready for Release”) can automatically trigger specific downstream actions in your CI/CD pipeline for deployments across various environments, based on conditions being satisfied. At a high level, this is made possible by the pieces below (sketched in code after the list):
- Jira webhooks to listen to workflow transitions
- A serverless compute service (e.g., Azure Functions, AWS Lambda, or any cloud equivalent) to implement the business logic
- Your existing CI/CD stack to perform the actual deployment and verification tasks
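Here is a hedged sketch of that glue layer, written with Flask for brevity (the same handler body fits an Azure Function or AWS Lambda). The pipeline endpoint, token, and payload shape are illustrative; the Jira webhook fields follow Jira's standard issue payload.

```python
import os
import requests
from flask import Flask, request, jsonify

app = Flask(__name__)
PIPELINE_URL = os.environ["PIPELINE_TRIGGER_URL"]  # your CI system's trigger endpoint
PIPELINE_TOKEN = os.environ["PIPELINE_TOKEN"]

@app.post("/jira-webhook")
def on_transition():
    event = request.get_json(force=True)
    issue = event.get("issue", {})
    status = issue.get("fields", {}).get("status", {}).get("name", "")
    if status == "Ready for Release":  # the workflow transition we care about
        requests.post(
            PIPELINE_URL,
            headers={"Authorization": f"Bearer {PIPELINE_TOKEN}"},
            json={"issue": issue.get("key"), "environment": "staging"},
            timeout=10,
        )
    return jsonify(ok=True)
```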
By using Jira as the single source of truth, teams gain real-time visibility, traceability, and a structured release process—without requiring manual intervention.
This approach allows teams to:
- Reduce operational overhead and context switching
- Simplify governance and auditing
- Improve release reliability and production stability
- Accelerate time-to-production through hands-free execution
Join us as we walk through our real-world implementation and learn how to modernize software delivery by connecting planning with execution, from idea to deployment, all in a streamlined and scalable way for your existing CI/CD setup.
This session gives insights into how the team strategically transformed an enterprise customer's operations with AWX, along with the learnings and challenges encountered during the collaboration.
Configuration-as-Code for Streamlined Deployment: Implementing a configuration-as-code framework has standardised deployment processes across the automation vertical, significantly improving consistency, accelerating time to market, and reducing deployment errors.
Chargeback Model: A newly developed chargeback system that ensures transparent financial accountability for internal clients. This model drives efficient resource utilisation, enabling our teams to align usage with costs effectively.
Enhanced Security and Compliance using Event-Based Automation: Leveraging Red Hat Insights has strengthened adherence to rigorous security and regulatory standards. This approach mitigates risk, ensuring that all automation aligns with the bank’s high security and compliance benchmarks.
Empowering Internal Customers with a Robust Release Management Framework: This framework is tailored to our internal customers, enabling them to create, version, and release their own Ansible code with confidence.
Join us to elevate your automation journey with AWX and the Ansible Automation Platform.
In this talk, we’ll explore Remote Code Execution (RCE) and arbitrary command execution attacks in detail by diving into real-world vulnerabilities. I intend to explain how attackers exploit popular open source libraries through specific CVEs.
Vulnerabilities that we will look into (see notes for detailed explanations):
- CVE-2024-47076 (cups-filters): A vulnerability in CUPS allows attackers to exploit a flaw in how it processes print requests. By sending a malformed request, an attacker can trigger a memory issue, potentially leading to the attacker taking control of the system.
- CVE-2024-6345 (python-setuptools): Attackers can leverage weaknesses in the package_index module to run arbitrary code during package downloads, potentially compromising entire Python build environments.
- CVE-2024-32002 (git): A vulnerability enables code execution during the cloning of local repositories, posing a risk to version control workflows.
This session includes a live demo showcasing an attack scenario in a controlled environment, providing attendees with practical insights into exploit execution.
Scaling GitOps for large-scale deployments can be challenging with a single repository or controller. This talk explores sharding as a strategy to optimize performance, improve reliability, and manage complexity in GitOps workflows for multi-environment or multi-tenant setups.
Most developers use container base images without fully understanding their security posture. Even “minimal” or “hardened” images can contain vulnerabilities, and static security choices alone aren’t enough. This session will test commonly used container images (Alpine, Debian, Ubuntu, and Distroless) to reveal how many vulnerabilities they contain. We’ll explore why base image security is a moving target and how teams can ensure long-term security without constant manual intervention.
Live Demo:
• Scan widely used container images to reveal hidden vulnerabilities.
• Compare audience predictions vs. real-time scan results.
• Apply automated remediation to show how security can be continuously maintained.
Everyone loves writing tests, don’t they? How do you write good tests? What tools are available for you to write good tests?
In this session, I will dive into the many features of Quarkus that help developers write good tests. I will highlight some of the features of Quarkus, Dev Services and Continuous Testing, which help make testing easier. Additionally, I will live code some tests for common use cases developers encounter, including unit, integration, and “black box” testing of imperative and reactive RESTful and event-driven applications that use common services, such as databases and Kafka brokers. I will discuss techniques such as mocking, spying, and interaction-based testing/verification.
I'll even spend some time showing how IDE-based AI assistants can help!
Once you see how easy TDD can really be, there isn't a reason not to do it!
In today’s fast-paced, data-driven world, accessing and analyzing structured data efficiently is essential for timely decision-making. Yet, traditional SQL-based querying often limits accessibility to technical users. IBM’s Semantic Layer, a key component of the Data Intelligence Platform, eliminates this barrier by allowing business users to interact with complex data using natural language.
In this session, we’ll showcase how IBM leverages generative AI to enrich metadata and drive natural language to SQL (Text2SQL) transformations—empowering users across the enterprise. We’ll explore the technical architecture, including model selection, semantic parsing, schema linking, enrichment pipelines, and query optimization. You’ll see real-world examples of a conversational analytics agent that delivers deeper insights from business data.
This solution is deployed on IBM’s Multi-Cloud SaaS Platform, powered by AWS services and running on Red Hat OpenShift. Whether you're a developer, data scientist, or business stakeholder, this session will demonstrate how IBM and AWS together enable smarter, more intuitive ways to work with data.
Service mesh debugging often feels like spelunking blindfolded. Understanding what’s happening inside an Envoy proxy—especially within a Consul service mesh—can be opaque, time-consuming, and disruptive to production workloads. Enter xDSnap: a lightweight, Go-based open-source tool that brings visibility into the real-time state of Envoy sidecars running in Kubernetes.
In this lightning talk, we’ll explore how xDSnap captures and organizes Envoy config dumps, statistics, and logs from Consul dataplanes—giving platform engineers a snapshot they can rely on for diagnosing service discovery issues, xDS update delays, and misconfigurations. We'll cover:
- How xDSnap was inspired by tools like ksniff and netshoot
- Its Kubernetes-native, resource-conscious architecture
- Real-world debugging workflows it accelerates
- Upcoming features, like a background agent for real-time data collection and automated analysis
Whether you’re operating in a multi-tenant mesh or troubleshooting service routing failures, xDSnap helps you cut through the noise—automatically.
ARA (ARA Records Ansible) is an Ansible development tool that makes it much easier to understand, troubleshoot, and debug Ansible content during the development process. It can also help you collaborate with your team members on Ansible content development. This talk will cover the following topics:
- What ARA is and how it works
- How to set up ARA in your environment
- How to use ARA to understand, troubleshoot, and debug Ansible content
- How to use ARA to collaborate with your team members on Ansible content development
- How to integrate ARA into your CI/CD pipeline
- How to use ARA to track changes in your Ansible content
The DevSecOps environment is evolving rapidly: it's no longer a matter of integrating security scans into legacy CI/CD pipelines. Instead, we're headed toward a future of intelligent, adaptive workflows powered by AI. In this session, we explore how integrating machine learning models into DevSecOps practices makes software delivery more intelligent and more secure.
We’ll dive into practical examples of AI-enhanced pipelines that go beyond basic automation. These advanced workflows are capable of proactively identifying vulnerabilities, detecting anomalies in real-time, and streamlining incident response. By embedding machine learning into popular CI/CD tools like GitLab and Tekton, teams can implement predictive testing and dynamic compliance validation that evolves with the system, improving both security and efficiency.
Attendees will discover real-world use cases and strategies drawn from large-scale enterprise environments, where AI has significantly boosted security posture without slowing down development. We’ll demonstrate how to set up intelligent pipelines that not only react to issues but anticipate them, enabling a more resilient and agile approach to DevSecOps.
If you're interested in scaling secure deployment, gaining more visibility, or eliminating manual overhead, this session presents a view of how AI can be used as a force multiplier. It is ideal for DevOps engineers, security teams, and architects who want to level up their toolchain and stay ahead of an increasingly complex threat landscape.
As quantum computers rapidly increase in size and relevance, all public key systems that rely solely on RSA or elliptic curve cryptography are under threat. So, how do you protect your software supply chain? If you sign your software or artifacts, this talk is for you.
We will explore the implementation of post-quantum cryptography (PQC) for digital signing workflows specifically, with a focus on NIST recommendations and FIPS-compliant algorithms (like ML-DSA and SLH-DSA), as well as recent industry advances. Then, we will discuss open-source solutions, highlight the contributions of open-source developers, and consider implementation challenges you may experience in this fast-evolving cryptographic landscape.
You will leave this talk with a solid understanding of the current state of PQC in open source, and the knowledge to implement the best solutions for your software signing needs.
In the modern AI age, we need proactive responses, not just reactive fixes. In this session, we explore how AI-driven event detection combined with Event-Driven Ansible (EDA) enables systems to autonomously predict, diagnose, and remediate issues, transforming IT operations into self-healing ecosystems. We will talk about:
- RHEL security issues captured by Insights, which sends an alert to EDA, and then the magic of AI
Traditional approaches to improving AI model performance—scaling model size or training data—are increasingly constrained by cost, latency, and diminishing returns.
Inference-Time Scaling (ITS) offers an orthogonal solution by optimizing how computational resources are allocated during inference. By restructuring search and evaluation strategies at test time, ITS significantly enhances model output quality without retraining or expanding model parameters.
In this talk, we will introduce the top ITS methods and show how you can try them on your existing models using off-the-shelf toolkits such as reward_hub and the inference_time_scaling library.
In the rapidly evolving landscape of software development, integrating artificial intelligence (AI) into command line interface (CLI) applications offers a powerful way to enhance productivity, streamline workflows, and drive innovation. This lightning talk will explore the transformative potential of CLI-based AI apps, demonstrating how developers can leverage these tools to automate complex tasks, improve decision-making, and accelerate development processes.
Key areas covered in this talk include:
- Introduction to CLI-Based AI Apps:
  - Understanding the synergy between CLI applications and AI technologies.
  - Highlighting the benefits of using CLI tools for AI-driven tasks, including speed, efficiency, and scalability.
- Building AI-Powered CLI Tools:
  - A practical guide to creating CLI applications that incorporate AI functionality using languages like JavaScript and Python.
  - Demonstrating how to integrate machine learning models into CLI tools for tasks like data analysis and natural language processing.
- Enhancing Developer Productivity with AI:
  - Real-world examples of how AI-powered CLI apps can automate repetitive tasks, reduce errors, and provide intelligent insights.
Join us for a night of fun at Bleacher Bar, located right under Fenway Park! It is about a 20-minute walk from the conference venue.
The party is open to all on a first-come, first-served basis, so get there early! Conference badges and ID are required to enter.
Address: 82A Lansdowne St, Boston, MA 02215.
Boot time is a critical KPI in the automotive industry. This talk will recap how boot time was captured on a previous platform and the challenges encountered when transitioning to a new one. This session will explore new observability gaps uncovered, the optimizations applied, and their impact on overall boot performance. It will cover measurement methodologies, user-space optimizations, and platform-specific tuning that contributed to reducing boot time.
Meet and network with the following communities on the show floor!
- Red Hat
- Foreman, Katello, and Pulp Community Booth
- Podman
- RISC-V
- Ubuntu
- UXD
In this hands-on workshop, participants will learn how to develop powerful AI agents that can revolutionize IT infrastructure management through natural language interactions. As infrastructure complexity grows, the ability to leverage AI to simplify operations becomes increasingly valuable. This workshop will guide attendees through the complete process of building a Python-based AI assistant that can interpret natural language requests and execute complex infrastructure tasks.
Participants will learn how to:
- Architect an AI agent that connects large language models (LLMs) with backend systems
- Implement secure API integrations between AI models and infrastructure components
- Design conversation flows that translate natural language into precise technical operations
- Build a web interface for seamless interaction with the AI assistant
- Implement error handling and security best practices for production-ready agents
This workshop combines practical coding sessions with architectural insights, enabling participants to create their own customizable AI infrastructure assistants. By the end, attendees will have a working prototype they can extend to manage various aspects of their IT ecosystem.
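To give a flavor of the core pattern participants will build on, here is a hedged sketch: expose one infrastructure action as a tool and let the LLM decide when to call it. It uses the OpenAI-style chat completions API; the restart function is a stub, and the model name is just an example of a tool-calling-capable model.

```python
import json
from openai import OpenAI

client = OpenAI()

def restart_service(name: str) -> str:
    return f"service {name} restarted"  # stub: call your real infrastructure API here

tools = [{
    "type": "function",
    "function": {
        "name": "restart_service",
        "description": "Restart a named service on the managed host",
        "parameters": {
            "type": "object",
            "properties": {"name": {"type": "string"}},
            "required": ["name"],
        },
    },
}]

resp = client.chat.completions.create(
    model="gpt-4o-mini",  # example; any tool-calling model works
    messages=[{"role": "user", "content": "Please restart nginx on the web node"}],
    tools=tools,
)
call = resp.choices[0].message.tool_calls[0]  # the model chose to invoke the tool
args = json.loads(call.function.arguments)
print(restart_service(**args))
```

The workshop layers validation, security checks, and a web UI on top of this request-to-action loop.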
Suitable for DevOps engineers, system administrators, and developers with basic Python knowledge, this workshop provides the foundation for building the next generation of intelligent infrastructure management tools that dramatically enhance team productivity and system reliability.
Requirements: Basics of Python
Artificial intelligence is revolutionizing industries, from healthcare to education — but few realize it’s also quietly heating the planet. Training massive AI models like GPT-4 or Gemini requires immense computational power, leading to carbon emissions on par with major industries.
In this talk, we’ll uncover the hidden environmental cost of AI development, with real-world examples of the carbon footprint behind today’s most popular models. More importantly, we’ll explore how the emerging “Green AI” movement is pushing for more sustainable practices, including efficient model design, renewable-powered data centers, and carbon-conscious AI innovation. Attendees will walk away with a deeper understanding of why building a smarter AI future must also mean building a greener one — and how they can be part of that change.
Are you struggling to debug Linux and Windows nodes without direct host access? The Node Log Query feature, introduced in Kubernetes 1.30 via KEP-2258 [1], offers a game-changing solution. This session will show you how to leverage this standardized API to retrieve logs from Windows nodes, streamlining troubleshooting and enhancing debugging.
Through live demos using the kubectl CLI and plugins, we’ll explore real-world use cases, such as debugging a Windows service failure and monitoring node health in a hybrid Kubernetes cluster. We’ll also discuss security best practices and demonstrate integration with other platforms. Whether you’re a developer, operator, or DevOps engineer, you’ll gain actionable insights that elevate your ability to record what happened and when, especially on Windows nodes.
[1] https://github.com/kubernetes/enhancements/issues/2258
This talk will cover using AI with RamaLama, including new features and a roadmap for the future.
RAG apps save up to 60% of the cost compared to standard LLMs. But in this talk, I will show you a way to save even more on top of that, because 2025 will be all about optimising the cost of building LLMs and their apps. RAGCache tackles RAG's bottlenecks with cutting-edge techniques:
- Dynamic Knowledge Caching: Stores intermediate states in a structured knowledge tree, balancing GPU and host memory usage.
- Efficient Replacement Policy: Tailored for LLM inference and RAG retrieval patterns.
- Seamless Overlap: Combines retrieval and inference to minimize latency.
Integrating RAGCache with tools like vLLM and Faiss delivers:
- 4x faster Time to First Token (TTFT).
- 2.1x throughput boost, optimizing latency and computational efficiency.
The talk goes through:
1. Current challenges of RAG
2. A solution that reduces cost and improves user experience
3. How does it work?
4. How well does it perform?
5. What are the key benefits?
6. Lastly, a few real-world applications
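To illustrate the caching idea in miniature (a toy analogy, not the RAGCache implementation), memoizing per-document prefill work is what turns repeated retrievals of hot documents into cache hits:

```python
import hashlib
from functools import lru_cache

@lru_cache(maxsize=1024)
def prefill_states(doc_hash: str) -> str:
    # Stand-in for computing a document's attention key/value states,
    # the dominant cost that a RAG cache avoids recomputing.
    return f"kv-states-{doc_hash[:8]}"

def answer(query: str, retrieved_docs: list[str]) -> str:
    states = [prefill_states(hashlib.sha1(d.encode()).hexdigest())
              for d in retrieved_docs]
    return f"decode({query!r}, reusing {len(states)} cached document states)"

print(answer("What is RAGCache?", ["doc A", "doc B"]))
print(answer("Tell me more", ["doc A"]))   # "doc A" prefill served from cache
print(prefill_states.cache_info())         # hits=1 after the second call
```

RAGCache applies the same intuition at the GPU level, with a knowledge tree and a replacement policy tuned to LLM inference patterns.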
Generative AI's transformative potential is undeniable, yet its true power lies in addressing real-world challenges. This lightning talk will pivot from the pervasive hype surrounding Gen AI to the foundational necessity of identifying strong, use case-driven applications. We'll argue that defining clear problems and understanding user needs are the crucial first steps in building successful AI products and businesses, ensuring that technological marvel translates into tangible value. Crucially, this talk will conclude with solid, contrasting examples of compelling AI use cases currently being implemented across various industries versus examples of poorly conceived or impractical applications. Attendees will gain actionable insights on how to move beyond the buzz and strategically pinpoint high-potential AI opportunities.
We hear from customers around the world that their ability to create value for their end customers, both internal and external, depends on their ability to deliver applications and services faster.
Containers and Kubernetes fundamentally enable DevOps, microservices, AI/ML, and cloud-native application development deployed across the hybrid cloud. Red Hat Enterprise Linux (RHEL) bootc images enable you to build, deploy, and manage the operating system as if it were any other container. You can converge on a single container-native workflow to manage everything from your applications to the underlying OS.
In this presentation you will learn how to use image mode for RHEL to build, test, and deploy operating systems by using the same tools and techniques as application containers. Image mode for RHEL uses the same tools, skills, and patterns as containerized applications to deliver an operating system that is easy to build, ship, and run. This presentation will cover the concepts behind image mode and help introduce foundational concepts required to package operating systems in Open Container Initiative (OCI) container images.
We will also introduce artificial intelligence (AI) concepts, models, and application recipes to allow users to explore how to package AI tools as containers and prepare them for installation as a bootable container image.
This presentation is appropriate for developers, system administrators, and data scientists interested in building and packaging AI tools. You'll learn and understand the concepts behind RHEL image mode.
Midori is a web browser focused on privacy, security, and productivity. It includes applications such as a VPN, an open source AI-powered search engine, and tools for product development.
As much as the industry likes to tout the good word of DevOps, there has always seemed to be a hierarchy. First comes the Dev. And then comes the Ops. It's right there in the name! What if instead we were able to build out both application and operational implementations at the same time? Instead of Ops being the afterthought, the thing we worry about after launch, we can ensure that both are treated with equal value and priority!
How, you might ask? With tools like Devcontainers and Grafana's LGTM stack, we can take application and operational development with us wherever we go. Now, instead of having to wait until our applications are deployed to some external environment, we can begin hacking at our code and dashboards right away. The benefit is that you can begin shipping both your artifacts and requirements immediately, instead of waiting until you ship your code to find out that your observability components might not work at all!
In this talk we'll introduce attendees to Devcontainers, show how to set up an environment that includes your application and an observability stack, and give a small example of how you can increase your operational and development velocity by utilizing this approach.
In this talk, we explore the practical challenges and engineering trade-offs involved in building multi-agent AI systems.
After a brief overview of what constitutes an AI agent, we’ll trace the evolution of agentic workflows—from the explosive interest in AutoGPT in 2023, through the rise of visually empowered agents in 2024, to today’s surge of agentic AI across industries. We’ll then examine where and why businesses should deploy multi-agent architectures, highlighting key use cases in SaaS automation and domain-specific “Vertical AI.”
On the engineering side, we dive into three core pain points:
Inference Cost vs. Performance
- Balancing API or hardware expenses against tool-calling accuracy and latency
- Strategies for using smaller, specialized models—enhanced via prompt engineering and reinforcement learning—to match or exceed larger, general-purpose alternatives
Memory Management
- Physical layer: vector stores, knowledge graphs and prompt-caching frameworks (e.g., Llama-Stack, AutoGen)
- Logical layer: shared context protocols that minimize redundant inference and shrink effective context windows
Agent-to-Agent Communication
- Physical/transport layer: HTTP-based JSON-RPC 2.0 with async-first patterns (polling, SSE, webhooks); a minimal request sketch follows this list
- Logical layer: pipelined and host-agent orchestration workflows
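To make the transport layer concrete, here is a minimal sketch of one agent calling another as a JSON-RPC 2.0 request over HTTP; the endpoint URL, method name, and params are illustrative assumptions for the example, not a normative A2A schema.

```python
import json
import urllib.request

# Minimal JSON-RPC 2.0 request envelope; "tasks/send" and the params
# are illustrative placeholders, not a fixed A2A schema.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tasks/send",
    "params": {
        "task": "summarize",
        "input": "Q3 incident report ...",
    },
}

# POST the envelope to a (hypothetical) agent endpoint and read the reply.
req = urllib.request.Request(
    "http://localhost:8080/rpc",  # assumed agent URL
    data=json.dumps(request).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    reply = json.load(resp)
print(reply.get("result"))
```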
We’ll illustrate these concepts with benchmark data and a pre-recorded demo showcasing a full agentic pipeline—MCP function calling, A2A orchestration, and memory-driven optimization.
Finally, we’ll close with future directions for agentic systems and open the floor for questions.
Your judgment‑free zone for open source AI innovation and exploration. Every story—whether triumphant, hilarious, or face‑plant—fuels our collective learning.
In this interactive session, we invite developers, data scientists, tinkerers, and curious onlookers to share their open source AI exploits in any form. Our format is simple: volunteers sign up on the spot, step up to the mic, and share what they think we should know for 5–7 minutes. The audience will respond with positive-vibe improv: “Yes, and…”
That’s it. You’ll leave with fresh connections and potential collaborators as we build the future of open source AI together.
Docling, an open source package, is rapidly becoming the de facto standard for document parsing and export in the Python community. It has earned close to 30,000 GitHub stars in less than a year and is now part of the LF AI & Data Foundation. Docling is redefining document AI with its ease and speed of use. In this session, we’ll introduce Docling and its features (with a short usage sketch after the list), including:
- Support for a wide array of formats—such as PDFs, DOCX, PPTX, HTML, images, and Markdown—and easy conversion to structured Markdown or JSON.
- Advanced document understanding through capture of intricate page layouts, reading order, and table structures—ideal for complex analysis.
- Integration of the DoclingDocument format with popular AI frameworks—such as LlamaIndex, LangChain, and Llama Stack—for retrieval-augmented generation (RAG) and QA applications.
- Optical character recognition (OCR) support for scanned documents.
- Support for visual language models like SmolDocling, created in collaboration with Hugging Face.
- A user-friendly command line interface (CLI) and MCP connectors for developers.
- How to use it as a service and at scale by deploying your own docling-serve.
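As a taste of the API described above, here is a minimal conversion sketch using Docling's DocumentConverter; the input filename is a placeholder, and defaults are assumed for all conversion options.

```python
from docling.document_converter import DocumentConverter

# Convert a local file (or URL) into a structured DoclingDocument.
converter = DocumentConverter()
result = converter.convert("report.pdf")  # placeholder input

# Export the parsed document to Markdown for downstream RAG/QA use.
print(result.document.export_to_markdown())
```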
Transitioning into management comes with a set of challenges that aren’t always obvious at the start. This talk outlines what new managers should expect across three core areas: managing teams, managing individuals, and managing yourself. We’ll discuss how to navigate the nuances of these areas to help you be a successful manager.
We’ll explore issues such as lack of team vision, neglecting leadership development, and hesitating to ask basic questions. We’ll also reflect on the pressures of being a manager and how that can affect your own performance.
Throughout the session, we’ll share real examples of mistakes we made early in our management careers, how we addressed them, and what we learned in the process. We’ve made these mistakes, so you don’t have to.
As the online world continues to evolve, the intersection of Web3 and container solutions is creating a new era of decentralized cloud infrastructure. With their portability, scalability, and efficiency, containers are becoming the building blocks of choice for deploying and operating decentralized applications (dApps) and distributed computing environments in the Web3 world. By virtualizing infrastructure and offering uniform runtime environments, containers allow developers to build, deploy, and manage services across geographically dispersed nodes—a foundational capability for decentralized systems.
There is a new trend here that comes with new and difficult security implications. Traditional security models depend on central control and fixed boundaries, which are not easily applicable to decentralized settings. In Web3 environments, where control, execution, and data are distributed, trust and integrity are much harder to ensure. Containers in such contexts introduce new kinds of threats, such as vulnerabilities in base images, insecure communication between nodes, and the threat of tainted containers impacting a bigger network of services.
This session will explain how containers are fueling the decentralized cloud in Web3, covering both the potential and the security risks involved. It will also examine ways to address those risks, including cryptographic identity, zero-trust networking, secure orchestration, and immutable infrastructure. Attendees will learn best practices for protecting and scaling containerized solutions in a highly decentralized world, and how to address the evolving threat landscape before it gets the better of them.
CPU architectures for enterprise applications have been a monoculture dominated by x86 since the advent of the modern PC and server in the 1990s. But in the past decade, things have been changing fast. Availability of easy-to-use Arm64 silicon at a low price point with Raspberry Pi advanced the open source ecosystem and vastly improved Arm64 support and performance by distributions like Fedora and Debian. Apple choosing Arm64 as the architecture for their laptops, combined with the popularity of Arm hardware with cloud application developers, really raised developer awareness. And the advent of server-grade Arm64 silicon in cloud server providers - either home-grown or based on Ampere CPUs - at a price point typically 20-30% cheaper per core hour than equivalent x86 instances has massively increased adoption of Arm64 in the cloud, and given a boost to enterprise support for the architecture through the cloud native ISV ecosystem.
Adding a new architecture to your application infrastructure brings some risk. Some of the questions I hear all the time are:
- What are the benefits of running my cloud applications on Ampere or other Arm64 instances?
- What workloads should I move first, and how can I pick & choose which application components run on which architecture?
- How do I know if this architecture is supported by all of my dependencies?
- What do I need to do to rebuild and requalify my software for Ampere servers?
- How can I containerize my software for multiple architectures, and how will workload placement on Kubernetes work?
- Are any of the supporting functions for cloud applications (log management, observability, service mesh, etc.) impacted?
This presentation will answer all of these questions. We will walk through a couple of case studies of people who saw both performance improvements and cost reduction when moving to Ampere-powered Arm64 instances. Then we will look at how, practically, you can update your CI processes to make multi-architecture containers using Podman and friends. Finally, we will talk about workload placement on Kubernetes, and how to use Argo Rollouts to give yourself a safety net and incrementally migrate your application to Arm64.
Effectively deploying Large Language Models (LLMs) in Kubernetes is critical for modern AI workloads, and vLLM has emerged as a leading open-source project for LLM inference serving. This session will explore the unique features of vLLM, which set it apart by maximizing throughput and minimizing resource usage. We’ll then walk through the lifecycle of deploying AI/LLM workloads on Kubernetes, focusing on seamless containerization, efficient scaling with Kubernetes-native tools, and robust monitoring to ensure reliable operations.
By simplifying complex workloads and optimizing performance, vLLM drives innovation in scalable and efficient LLM deployment by leveraging features like dynamic batching and distributed serving, making advanced inference accessible for diverse and demanding use cases. Join us to learn why vLLM is shaping the future of LLM serving and how it integrates into Kubernetes to deliver reliable, cost-effective, and high-performance AI systems.
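As an illustration of the serving model, here is a minimal sketch of querying a vLLM server through its OpenAI-compatible API; the base URL and model name are assumptions for the example.

```python
from openai import OpenAI

# vLLM exposes an OpenAI-compatible HTTP API; point the client at it.
# The URL and model name below are illustrative assumptions.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",
    messages=[{"role": "user", "content": "Summarize vLLM in one sentence."}],
)
print(response.choices[0].message.content)
```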
Large Language Models (LLMs) require preprocessing vast amounts of data, a process that can span days due to its complexity and scale, often involving petabytes of data. This talk demonstrates how Kubeflow Pipelines (KFP) simplify LLM data processing with flexibility, repeatability, and scalability. These pipelines are being used daily at IBM Research to build indemnified LLMs tailored for enterprise applications.
Different data preparation toolkits are built on Kubernetes, Rust, Slurm, or Spark. How would you choose one for your own LLM experiments or enterprise use cases, and why should you consider Kubernetes and KFP?
This talk describes how the open source Data Prep Toolkit leverages KFP and KubeRay for scalable orchestration of pipeline steps such as deduplication, content classification, and tokenization.
We share challenges, lessons, and insights from our experience with KFP, highlighting its applicability for diverse LLM tasks, such as data preprocessing, RAG retrieval, and model fine-tuning.
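For a flavor of what such orchestration looks like, here is a minimal KFP v2 pipeline sketch with placeholder steps; the component names and input path are illustrative, not the actual Data Prep Toolkit components.

```python
from kfp import dsl

@dsl.component
def deduplicate(input_path: str) -> str:
    # Placeholder for a real deduplication step.
    print(f"Deduplicating {input_path}")
    return input_path

@dsl.component
def tokenize(input_path: str) -> str:
    # Placeholder for a real tokenization step.
    print(f"Tokenizing {input_path}")
    return input_path

@dsl.pipeline(name="llm-data-prep")
def data_prep_pipeline(input_path: str = "s3://bucket/raw"):
    # Chain the steps: tokenization consumes the deduplicated output.
    dedup_task = deduplicate(input_path=input_path)
    tokenize(input_path=dedup_task.output)

if __name__ == "__main__":
    from kfp import compiler
    # Compile to a pipeline spec that can be submitted to KFP.
    compiler.Compiler().compile(data_prep_pipeline, "pipeline.yaml")
```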
In today’s rapidly changing software landscape, technical leaders play a pivotal role in shaping both projects and team dynamics. The true impact of a technical leader extends far beyond writing code. This session is designed for experienced engineers looking to elevate their influence and drive meaningful change within their teams and organizations.
We’ll cover:
- Defining the role of a technical leader and the nuances that differentiate it from management
- Strategies for navigating complex technical decisions and aligning them with business goals
- Techniques for fostering collaboration and trust across the organization
- The importance of mentorship and how to cultivate the next generation of technical talent
Join us to discover how to leverage your technical expertise to inspire and empower others, creating a culture of innovation and excellence. Attendees will leave with actionable insights to enhance their leadership skills and amplify their impact in their organizations.
P2CODE, a project funded by the Horizon Europe Research Programme, advances the IoT-edge-cloud computing continuum by simplifying application deployment on IoT devices. The P2CODE platform automatically manages infrastructure across diverse domains - including 5G, RAN networks, and cloud environments - enabling developers to focus on application development. Furthermore, the platform offers a device registration framework for seamless integration of IoT devices into a centrally managed resource pool, along with a unified view of system and application telemetry to support application lifecycle management.
In this session, we will explore how the P2CODE platform streamlines application development and deployment across the continuum. We will showcase the platform’s capabilities through two compelling use cases: (1) Drone and ground vehicles coordination for search and rescue operations, and (2) Monitoring manufacturing operators via exoskeletons and sensors. This talk is of particular interest to DevOps engineers working at the intersection of IoT, edge, and cloud computing.
In this talk, I will introduce Red Hat Enterprise Linux CoreOS (RHCOS) and Fedora CoreOS (FCOS), exploring the role of these operating systems in the container ecosystem and within OpenShift. We'll look at their key features, such as automatic updates and ease of customization, and how they support a reliable and efficient environment for container orchestration. I’ll also explain some of the recent changes and improvements in RHCOS and FCOS, and how they reflect ongoing efforts to bring more flexibility, stability, and innovation to production environments.
Quantum computing is transforming disease detection by enabling powerful machine learning models on cutting-edge hardware. In this session, we’ll explore the fundamentals of quantum computing, its role in medical diagnostics, and the unique capabilities of cloud-accessible quantum systems. Learn how quantum machine learning models are trained, the challenges and opportunities in this emerging field, and what the future holds as quantum hardware evolves. With practical insights and real-world examples, this talk provides a comprehensive introduction to applying quantum computing in healthcare for precision and innovation.
An overview of the current state of software ecosystem readiness on RISC-V
Have you ever had to deal with training machine learning models on very large data? If the data does not fit in main memory, how can you use GPUs, whose memories are even smaller? Many of these cases require strategies for handling large data sets. In this presentation, we will introduce DASF, a framework that brings together lazy data loading techniques using Dask, acceleration techniques using RAPIDS AI, and other techniques that facilitate the use of large data in ML pipelines, locally or in HPC environments. We will also present a case study carried out with a company in the oil and gas sector.
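To illustrate the lazy-loading idea, here is a minimal Dask sketch of the pattern DASF builds on; the array shape and chunking are arbitrary placeholders standing in for data that would not fit in memory.

```python
import dask.array as da

# Build a lazy, chunked array; nothing is materialized yet.
x = da.random.random((1_000_000, 100), chunks=(100_000, 100))

# Compose operations lazily; Dask records a task graph instead of
# executing immediately.
normalized = (x - x.mean(axis=0)) / x.std(axis=0)

# Only .compute() triggers execution, streaming chunk by chunk.
print(normalized[:5, :5].compute())
```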
I discovered the Go language but didn’t have time to follow multiple tutorials to learn it. Then one day I discovered the Træfik reverse-proxy project while switching my infrastructure to a fully dockerized one. I’ve been a Træfik user since v1.4, but after many months of using it I ran into an issue: there was no caching system in this reverse proxy. I scoured the internet for an existing solution, but nothing turned up.
So I decided to write my own Træfik cache system, but the main question was “Which language?” PHP? Nah. Node.js? What a joke! C++? I didn’t learn that language at school, and it’s really insane to learn.
Then, while browsing the Træfik GitHub repository, I decided to write it in Go. Another good point: it’s compatible with Docker integration.
So I started the project and called it Souin. Let’s see together how I brought it up from code to deployment.
This talk provides an in-depth, technical discussion of how OpenShift clusters are created. Beginning with the design of the installer and following an installation linearly, the talk explains how the installer uses a directed-acyclic asset graph to generate resources required for a cluster, bundles those resources into Ignition, creates cloud infrastructure using cluster-api, and depends on cluster operators to complete installation. This talk will also explain the recent replacement of Terraform with cluster-api controllers.
Podman has gained significant traction in the container ecosystem, but how is it really doing? This session moves beyond anecdotal evidence to provide a data-driven "health check" on the Podman project using statistics and trend analysis.
We'll delve into key questions, including:
- Community Health: How engaged and vibrant is the Podman upstream community?
- Performance & Footprint: How does Podman measure up in terms of performance benchmarks and resource usage?
- Comparative Landscape: How does Podman stack up against other container management tools in the ecosystem?
Join us for an objective analysis of Podman's current state and trajectory based on real-world data.
Objectives:
Attendees will learn:
- Methods for gathering and analyzing GitHub project data.
- Key metrics and trends to monitor for open-source project health.
- Practical approaches and reusable code snippets for performing similar analyses on their own projects (a minimal example follows this list).
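As a starting point for this kind of analysis, here is a minimal sketch that pulls headline repository metrics from the GitHub REST API; unauthenticated requests are rate-limited, and real project-health analysis would track these numbers over time rather than as a single snapshot.

```python
import json
import urllib.request

# Fetch headline metrics for a repository from the GitHub REST API.
# Unauthenticated requests work but are rate-limited.
url = "https://api.github.com/repos/containers/podman"
with urllib.request.urlopen(url) as resp:
    repo = json.load(resp)

for key in ("stargazers_count", "forks_count", "open_issues_count"):
    print(f"{key}: {repo[key]}")
```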
Target Audience:
This talk is ideal for project maintainers, open-source community members, developers, and anyone interested in data-driven project analysis.
In this talk, we will introduce SDG Hub, an open-source toolkit developed at Red Hat for customizing language models using synthetic data. We will begin by unpacking what synthetic data means in the context of LLMs, and how it enables model customization.
The session will explore SDG Hub’s core components: prompts, blocks, and flows, and demonstrate how users can compose, extend, or modify pipelines to fit specific tasks. It will also cover strategies for choosing the right teacher model depending on the use case (reasoning, translation, etc.), and walk through two real-world examples: building a document-grounded skill using a pre-built pipeline, and customizing a reasoning model by authoring new blocks, prompts, flows, and integrating a custom teacher.
The talk will conclude with a demo of the new SDG Hub GUI, showcasing how non-experts can visually construct and manage their own synthetic data pipelines.
Agile. Many engineers cringe at the word, associating it with overhead, planning, and formality—seemingly at odds with the flexibility needed to solve complex problems.
But what if I told you that you were actually born agile? I'm sure you’ll hate me, but stick around, and you might see I’m not that far off.
Agile isn’t just a business methodology; it mirrors how we naturally grow and learn. We start life without a plan, learning through trial, error, and support—just like our careers and projects. We iterate, refine, and adapt rather than having everything figured out from the start.
So why do engineers resist agile if it’s so innate? This talk explores that tension, uncovering how the same mentorship, collaboration, and continuous learning you grew up with can help teams move past frustration and fully embrace agility.
This talk explores how platform engineering teams can build a self-service AI/ML infrastructure on Kubernetes—enabling dynamic provisioning, policy enforcement, and full observability—while allowing data scientists to stay focused on model development.
While ClusterAPI simplifies cluster provisioning, managing AI/ML workloads demands a full-lifecycle approach. We demonstrate how to extend ClusterAPI with tools like Backstage, OpenFeature, k0s, Prometheus, Sveltos, and k0rdent to build a scalable, secure, and automated AI/ML platform.
Key takeaways include:
- AI/ML cluster provisioning with ClusterAPI and k0s
- Policy-driven multi-cluster automation
- Controlled model rollouts with feature flags
- Self-service ML environments via an Internal Developer Platform
- Observability and performance monitoring at scale
Join me to discover how a Kubernetes-native approach can empower your AI/ML platform engineering journey.
Let’s get real and talk about a production use case serving millions of people …
How we are using GenAI to improve the operations of our OpenShift/k8s environment and deliver a better customer experience.
Join this session to explore how integrating supported open source, unified workflows, advanced monitoring, and strong governance—combined with the power of GenAI—has elevated the quality and reliability that we bring to our customers, while unlocking new creative possibilities for developers.
Join us for an overview of all the latest methods of language model post-training openly available today! We will begin with offline methods like standard Supervised Fine-Tuning (SFT), Parameter-Efficient Fine-Tuning (PEFT), Direct Preference Optimization (DPO), and continual learning techniques for further tuning existing instruct models. We will then move into online reinforcement learning options like Reinforcement Learning from Human Feedback (RLHF) and Group Relative Policy Optimization (GRPO). The talk will consist of a walkthrough of the use-cases for each method, as well as how to get started today via our very own Training Hub!
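To make the offline end of this spectrum concrete, here is a minimal SFT sketch in the style of the Hugging Face TRL library; the model and dataset names are placeholders, and exact constructor arguments vary between TRL releases.

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Placeholder instruction-tuning dataset; swap in your own data.
dataset = load_dataset("trl-lib/Capybara", split="train")

# Fine-tune a small causal LM with supervised fine-tuning (SFT).
trainer = SFTTrainer(
    model="Qwen/Qwen2.5-0.5B",  # any causal LM checkpoint (placeholder)
    train_dataset=dataset,
    args=SFTConfig(output_dir="./sft-out", max_steps=100),
)
trainer.train()
```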
Domain Driven Design (DDD) is intended for implementing complicated business logic and is an excellent fit for microservices development.
Domain Driven Design provides a repeatable, logical structure that makes implementing business logic easier, faster, and more maintainable. Hexagonal Architecture (or Ports and Adapters) excels at producing loosely coupled, interchangeable components that fit well with DDD.
In this presentation I will introduce Domain Driven Design and dive into the DDD concepts of Aggregates, Repositories, Value Objects, Services, Ubiquitous Language, Adapters, and Shared Kernels. I will also build an application using these patterns and leverage Hexagonal Architecture for easy extensibility. Testing will of course be included.
You will leave this presentation with a basic knowledge of Domain Driven Design, how to structure and test your application to implement DDD and how to use Hexagonal Architecture to extend your applications.
No slides; just live code.
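In that spirit, here is a minimal sketch of the shape these patterns take in code: an immutable Value Object, a Repository port, and one interchangeable adapter (all names invented for illustration).

```python
from dataclasses import dataclass
from typing import Optional, Protocol

# Value Object: immutable, compared by value rather than identity.
@dataclass(frozen=True)
class Money:
    amount: int  # minor units, e.g. cents
    currency: str

# Port: the domain's view of persistence, free of infrastructure detail.
class OrderRepository(Protocol):
    def find(self, order_id: str) -> Optional[dict]: ...
    def save(self, order_id: str, order: dict) -> None: ...

# Adapter: one interchangeable implementation of the port; a database
# adapter could be swapped in without touching the domain code.
class InMemoryOrderRepository:
    def __init__(self) -> None:
        self._orders: dict[str, dict] = {}

    def find(self, order_id: str) -> Optional[dict]:
        return self._orders.get(order_id)

    def save(self, order_id: str, order: dict) -> None:
        self._orders[order_id] = order

repo: OrderRepository = InMemoryOrderRepository()
repo.save("42", {"total": Money(1999, "USD")})
print(repo.find("42"))
```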
This presentation will explore some of the systemic but often hidden barriers faced by folks who don’t fit traditional engineer/developer stereotypes that exist within the tech industry, open source or otherwise. Topics will include the ways in which these inequities manifest, the impact they have on individuals seeking employment, and practical strategies for navigating these challenges.
My goal is to empower attendees to recognize the value of their strengths and experience, and provide them with actionable insights and tools to recognize and dismantle barriers, and ultimately find fulfilling career paths in an industry that may feel out of reach. I will also share real-life examples and stories to illustrate these points and encourage open dialogue.
Modern LLM applications demand reliable, reproducible performance numbers that reflect real-world serving conditions. This tutorial-style presentation walks attendees through every step required to collect meaningful inference benchmarks on consumer or datacenter NVIDIA GPUs using an entirely open-source stack on Fedora.
Beginning with enabling RPM Fusion and installing the akmod-nvidia driver, we show how to validate hardware visibility with nvidia-smi, then layer Podman 5.x and the NVIDIA Container Toolkit’s Container Device Interface to obtain rootless GPU access. We next demonstrate pulling the lightweight vLLM inference image, mounting a locally cached TinyLlama model downloaded via the Hugging Face CLI, and exposing an OpenAI-compatible HTTP endpoint. Finally, we introduce GuideLLM, an automated load-generation tool that sweeps request rates, captures latency buckets, throughput ceilings, and token-per-second statistics, and writes structured JSON for downstream analysis.
Live demos illustrate common pitfalls and give attendees troubleshooting checklists that transfer directly to any Red Hat-derived distribution. Participants will leave with a turnkey recipe they can adapt to larger models and multi-GPU nodes, and a clear understanding of how configuration choices cascade into benchmark accuracy. No prior container, CUDA, or benchmarking experience is assumed. Attendees also receive sample scripts and links for immediate hands-on replication today.
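Before running a GuideLLM sweep, it is worth sanity-checking the serving endpoint; here is a minimal probe of the OpenAI-compatible completions route, assuming the vLLM server from the walkthrough is listening on localhost port 8000 (host, port, and model name are assumptions for this sketch).

```python
import json
import urllib.request

# Probe the OpenAI-compatible endpoint before benchmarking.
payload = {
    "model": "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    "prompt": "Hello, world:",
    "max_tokens": 16,
}
req = urllib.request.Request(
    "http://localhost:8000/v1/completions",  # assumed server address
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp)["choices"][0]["text"])
```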
In cloud-native environments, application performance often degrades due to contention over shared resources such as CPU caches and memory bandwidth. Current container technologies lack mechanisms to isolate these resources, which compels operators to maintain low utilization by scaling out their deployments.
This session explores strategies used by hyperscalers like Google and Alibaba Cloud to mitigate such performance interference. We will review their published methodologies, extracting key principles that could guide the development of a Kubernetes-native performance isolator. Participants will gain insights into the design trade-offs and operational impacts of these tools. Additionally, we will discuss integration strategies for deploying such isolators in existing Kubernetes environments, aiming to optimize resource utilization while preserving application performance.
After 7 and a half years, Podman has grown from a bare Git repository to a well-established giant. Join Matt Heon, an engineer who has worked on Podman since its creation, for a journey through its history and future. This talk will investigate the founding of the project, the growth of the community, major changes in the project's direction, the decision to join the CNCF, and what to expect from Podman 6 and beyond. Attendees will learn about growing a project and building a community, making difficult decisions - both technical and nontechnical - and how they can contribute to the future of Podman.
Take all of the documentation and throw it into the database and we're done, right? Not so fast.
Large language models (LLMs) don't know everything and retrieval-augmented generation, or RAG, fills in the knowledge gaps with just-in-time retrieval of data from a database. Technical limitations and challenges loom large here, but there are plenty of difficulties brought over from the world of humans into the world of LLMs.
Come on a RAG journey with me as I recount some of the roadblocks my team faced as we built a product with RAG as a core component. Learn about the available technologies, how to build out a RAG stack, and how to avoid bringing human complexity into an already complex technical system.
Monitoring has been a cornerstone of computing since its inception, but the landscape is rapidly evolving. As we navigate this transformation, what does the future hold for open source based application monitoring and observability? This session is intended for DevOps Engineers, System Administrators, and Software Developers eager to enhance their understanding of modern monitoring techniques.
In this talk, I will explore the current state of application monitoring and observability, emphasizing how open source software is at the forefront of innovation in this field. Attendees will gain practical knowledge of new and better ways to monitor and observe their applications and systems. These improved techniques are applicable to known tools such as Icinga2, Netdata, Prometheus, Zabbix, and Cacti.
I will demonstrate how modern monitoring and observability approaches offer enhanced capabilities in today's dynamic IT environments, unlike traditional methods. Using Icinga2 on a Linux server instance as a case study, you will learn how new automation tools, modern notification channels, and integrations with external platforms can create a robust, event-driven application monitoring framework that adapts to your needs.
While the primary focus will be on innovative open source solutions, I will also touch on the growing role of artificial intelligence in monitoring and observability. I will briefly discuss monitoring concepts such as check-less checks, anomaly detection and predictive analytics, highlighting how these advancements can complement traditional methods and enhance overall observability.
Attendees can expect to gain valuable insights into future trends in application monitoring and observability. You will learn about new tools and techniques that you can implement in your own projects immediately.
Join me for "Open Source Monitoring: Innovations Shaping Tomorrow" and take the next step in transforming your approach to application monitoring and observability. You will uncover how open source software innovations are paving the way for a more responsive and effective IT monitoring landscape.
What if the best open source contributions aren’t just about code, but about how people think? As an autistic engineering leader and strategist, I’ll share how systems thinking, pattern recognition, and sensory sensitivity shaped my contributions and leadership in tech.
This talk will cover:
- How neurodivergent traits can enhance troubleshooting, architecture, and team communication
- Common barriers faced by autistic contributors, and practical solutions for open source communities that benefit everyone
- How to design inclusive environments
Who this talk is for:
This session is both personal and practical. It is targeted at project maintainers, engineering managers, and contributors who want to build more resilient, human-centered communities.
In many secure or industrial environments — like factories, labs, or embedded automotive systems — machines run in air-gapped or low-connectivity conditions. When systems fail, engineers often rely on scattered manuals or vendor documentation, which slows recovery. What if you could drop in a self-contained AI assistant that works offline — right at the edge?
In this session, we’ll show how to build a local GenAI troubleshooting assistant using Jetson Orin, Podman, and RamaLama, running entirely on-device. The LLM is containerized and served using RamaLama, optimized for edge inference with quantized models like Mistral 7B (GGUF). It’s paired with a local vector database and a retrieval-augmented generation (RAG) pipeline to search logs, KBs, and internal docs — all without internet access.
We’ll deep-dive into:
- Running LLMs efficiently on Jetson Orin with RamaLama + Podman
- Using RamaLama to orchestrate a multi-container RAG assistant
- Indexing structured and unstructured ops data with FAISS (a minimal sketch follows this list)
- Designing an edge AI assistant for Fedora IoT or RHEL for Edge
- Benchmarking performance and memory usage on-device
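As a flavor of the indexing step, here is a minimal FAISS sketch; the embedding dimension and vectors are random placeholders standing in for real embeddings of logs and docs.

```python
import faiss
import numpy as np

# Stand-in embeddings: in the real pipeline these would come from an
# embedding model run over logs, KBs, and internal docs.
dim = 384
docs = np.random.random((1000, dim)).astype("float32")

# Build an exact L2 index and add the document vectors.
index = faiss.IndexFlatL2(dim)
index.add(docs)

# Retrieve the 3 nearest documents for a query embedding.
query = np.random.random((1, dim)).astype("float32")
distances, ids = index.search(query, 3)
print(ids, distances)
```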
In the automotive world, Quality Management (QM) isn't limited to backend logic — it can be extended to integrate core system devices like audio, video, and virtualization (KVM and libkrun). In this talk, we'll explore how to leverage Podman Quadlets to bring these device capabilities to QM.
What happens when your Kubernetes cluster is pushed to its absolute limits? Does it gracefully scale to meet demand, or does it buckle under pressure? In this session, we’ll dive deep into Kube-burner, the only CNCF Sandbox project laser-focused on performance and scalability testing. Designed to push Kubernetes clusters to their breaking point, Kube-burner is the ultimate tool for validating scalability, and ensuring your infrastructure can handle real-world stress. Through real-world case studies, we’ll explore how Kube-burner’s trio of superpowers—benchmark orchestration, precise measurements, and rock-solid observability—can uncover performance regressions, validate cluster stability, and ensure your infrastructure is ready for the demands of modern workloads.
Whether you’re a platform engineer, SRE, or Kubernetes enthusiast, this session will equip you with the knowledge to use Kube-burner to set your Kubernetes benchmarks on fire—and ensure your clusters can handle the heat.
This talk will cover Dan's history at Red Hat, from working with Paul Cormier prior to Red Hat, through the early days of SELinux and where it came from, the early days of OpenShift, working with Docker and the subsequent fallout, container tools, and bootc, all the way through to RamaLama.
Join us for the closing of the conference and for a chance to win some prizes by participating in trivia!