DevConf.US 2025

The Missing Metrics: Measuring Memory Noisy Neighbors in Cloud Native Environments
2025-09-19, 101 (Capacity 48)

Competition for memory bandwidth and CPU caches between containers can increase application response times by 5x to 14x, even with CPU and memory limits in place. It can be triggered by common events like garbage collection, and existing observability tools do not collect the metrics needed to detect it. Because it manifests as latency SLO violations, operators often scale out and run at low utilization, which is expensive and only marginally improves response times.

CPU performance counters can detect memory interference. However, because interference events are frequent and short-lived, detecting them requires high-frequency measurements, which are challenging due to jitter and overhead.

This session first presents the causes of memory noisy neighbors, real-world patterns that trigger them, and the benefits of mitigation. We then show how a new open source collector combines CPU performance counters, eBPF, and high-resolution timers to identify noisy neighbors in Kubernetes.


What level of experience should the audience have to best understand your session?

Beginner - no experience needed

Dr. Jonathan Perry is a maintainer of the OpenTelemetry Network Collector and CEO of Unvariance, which develops tools to detect and mitigate noisy neighbors. He received his PhD from MIT for work on mitigating noisy neighbors in datacenter networks, then founded Flowmill, where he developed eBPF-based network monitoring tools prior to the company's acquisition by Splunk. He is based in Austin, Texas.
