DevConf.CZ 2025

Liat Pele

Liat Pele is a Development Team Leader at Red Hat, working on troubleshooting and validation tools for cloud infrastructure. She has developed system validation and automation tools to support the SRE group.

She previously contributed to research initiatives such as the Horizon 2020 NGPaaS project and holds a Ph.D. in Computational Chemistry from the Hebrew University of Jerusalem.

At DevConf.CZ 2025, she will share insights on using machine learning and LLMs to improve automated cluster validation and reduce false positives.


Company or affiliation

Red Hat

Job title

Engineering Manager


Session

06-12
14:00
35min
Predicting Faulty Validations in Cluster Issue Detection: A Machine Learning Approach
Liat Pele, Ofir Pele

Our team maintains a large-scale codebase for detecting and predicting issues in clusters, with hundreds of validation rules contributed and regularly updated by SRE engineers. A key challenge is identifying false positives and predicting which validations are most likely to require fixes.

In this research, we analyze validation rules as repeated code patterns, creating a unique dataset for machine learning. We compute numerical descriptors—such as code length, complexity, entropy, and time since introduction—across different Git branches and compare them with historical bug fixes. Preliminary results indicate strong correlations between these factors and validation reliability.
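To make the idea of numerical descriptors concrete, here is a minimal Python sketch of how a few of them might be computed for a single validation rule. It is an illustration under assumptions, not the speakers' implementation: the names `shannon_entropy`, `branch_count`, `describe_rule`, and `EXAMPLE_RULE` are hypothetical, the rules are assumed to be Python source, entropy is taken at the character level, and complexity is approximated by counting branching nodes in the syntax tree. Time since introduction would come from Git history and is not covered here.

```python
import ast
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Character-level Shannon entropy, in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Node types counted as decision points. This is a rough stand-in for
# cyclomatic complexity, chosen for illustration only.
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.Try, ast.BoolOp, ast.IfExp)


def branch_count(source: str) -> int:
    """Return 1 plus the number of branching nodes in the parsed source."""
    tree = ast.parse(source)
    return 1 + sum(isinstance(node, _BRANCH_NODES) for node in ast.walk(tree))


def describe_rule(source: str) -> dict:
    """Collect simple descriptors for one validation rule's source code."""
    return {
        "n_lines": len(source.splitlines()),
        "n_chars": len(source),
        "entropy_bits": shannon_entropy(source),
        "complexity": branch_count(source),
    }


# Hypothetical validation rule, used only to exercise the functions above.
EXAMPLE_RULE = (
    "def validate(node):\n"
    "    if node.ready and node.schedulable:\n"
    "        return True\n"
    "    return False\n"
)

if __name__ == "__main__":
    print(describe_rule(EXAMPLE_RULE))
```

Descriptors like these could then be joined with per-rule bug-fix history to form the feature table that a classical model is trained on.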

In this talk, we will present our findings from classical machine learning models and benchmark them against modern large language models (LLMs). We will discuss the effectiveness of both approaches and their potential impact on improving automated validation quality.

Artificial Intelligence and Data Science
D105 (capacity 300)