Mitali Bhalla
Mitali Bhalla, Site Reliability Engineer II at Red Hat, Inc.
With 3+ years of experience managing and optimising complex Kubernetes and OpenShift clusters, I specialise in ensuring the reliability, scalability, and performance of infrastructure in dynamic environments. Passionate about site reliability engineering, I focus on automating processes, improving system uptime, and troubleshooting complex issues across large-scale cloud-native platforms.
Projects: github.com/MitaliBhalla
Red Hat, Inc
Job title: SRE II
Sessions
As Kubernetes continues to evolve, HyperShift has emerged as a highly scalable and efficient way to manage clusters. This presentation will explore the future of Kubernetes cluster management, highlighting how HyperShift is reshaping the multi-cloud landscape and transforming strategies for enterprise-grade operations. With the rapid growth of Kubernetes across industries, managing clusters across multiple cloud providers while ensuring high performance, security, and cost efficiency remains a critical challenge. HyperShift addresses this by hosting the control planes of many Kubernetes clusters on a central management cluster, providing a streamlined and scalable approach to cluster management.
A key advancement within this ecosystem is Red Hat OpenShift Service on AWS (ROSA) with Hosted Control Planes (HCP). By separating control plane pods from the rest of the cluster and consolidating them within a single OpenShift management cluster, ROSA HCP enhances resilience, scalability, and operational efficiency. This architecture significantly reduces provisioning time (from 40 minutes in ROSA Classic to around 10 minutes with HCP), lowers infrastructure costs, and simplifies scaling by only requiring adjustments to worker nodes. In contrast, ROSA Classic often involves complex and costly scaling of control plane capacity.
ROSA HCP provides a fully managed cloud-native environment, empowering organizations to offload complex infrastructure management tasks such as security, compliance, and monitoring, while benefiting from the scalability and flexibility of Red Hat OpenShift on AWS. This innovative approach directly addresses customer feedback regarding cost, offering a more budget-friendly solution without compromising service quality.
During this presentation, we will demonstrate ROSA HCP in action, walking through the prerequisites for cluster creation, cluster access, and deploying and scaling a simple application, and showing how control plane management is handled by Red Hat. Join us to learn how HyperShift and ROSA HCP can simplify your Kubernetes cluster management and accelerate your cloud-native journey.
Kubernetes adoption is surging, with 96% of organizations using or evaluating it, according to the CNCF. Companies like Spotify, Airbnb, and Shopify operate dozens, if not hundreds, of Kubernetes clusters to support their global applications. But managing multiple clusters isn’t just a technical feat; it’s a logistical challenge. Consider this: a large enterprise managing 100 clusters could have tens of thousands of nodes and millions of pods. Each cluster generates a flood of metrics, logs, and alerts that must be coordinated to ensure high availability and performance.
Managing multiple clusters introduces new levels of complexity that traditional tools like Terraform and Ansible weren’t designed to handle. While these tools are effective for provisioning infrastructure, they fall short in addressing day-2 operations such as policy enforcement, cluster upgrades, and unified monitoring across multiple environments. Similarly, GitOps pipelines streamline application deployment but provide limited visibility into the overall health and governance of multiple clusters. Teams are often left without a single-pane-of-glass solution for managing configuration drift, enforcing security policies, or gaining visibility into workloads across clusters.
Why does this problem persist? Because multi-cluster Kubernetes, while powerful, introduces inherent complexities. Networking between clusters can suffer from latency, causing out-of-sync application instances. Kubernetes’ built-in security tools only apply within single clusters, requiring manual replication to ensure uniform enforcement. Monitoring tools must be deployed individually in each cluster, often resulting in fragmented observability and disjointed data correlation.
While solutions like Cluster API, Argo CD, and kcp offer partial relief, they lack the holistic approach needed for full multi-cluster lifecycle management. This is where Open Cluster Management (OCM) shines. OCM provides a unified framework for managing multiple Kubernetes clusters efficiently. The talk will feature a live demo showing how an OCM hub can manage two Kubernetes clusters. We’ll demonstrate how OCM automates lifecycle tasks, such as policy enforcement, while providing a centralized platform for monitoring, governance, and workload distribution. By correlating data from multiple clusters, OCM simplifies troubleshooting, minimizes latency issues, and ensures consistency across environments.
In the demo, you’ll see how OCM automates critical tasks such as upgrades and policy enforcement, an approach that carries over from two clusters to dozens. OCM’s centralized monitoring provides correlated insights that reduce downtime and troubleshooting complexity.
Whether you operate in hybrid, multi-cloud, or edge environments, this session offers practical insights into using OCM to reduce operational complexity, enhance resilience, and streamline Kubernetes operations at scale.