Sharan Harsoor
Principal Machine Learning Engineer at Red Hat's Data & AI team in Bangalore, building LLM-based systems, RAG pipelines, and agentic workflows. With 11+ years spanning AI, ML, and data engineering, he's passionate about sharing real-world architectures that actually work in production.
Session
Enterprise RAG systems fail not because of LLM limitations, but because of a critical, overlooked foundation: content segmentation. Organizations invest heavily in sophisticated retrieval architectures while using naive character-count splitting that destroys semantic coherence. Contract clauses severed mid-sentence, code functions fragmented, medical narratives broken apart: these segmentation failures cause hallucinations, inconsistent responses, and lost user trust.
This session demonstrates why intelligent content segmentation has emerged as a critical engineering discipline for production AI systems. Through live demonstrations, we compare the same enterprise knowledge base processed with naive splitting versus semantic-aware segmentation, measuring the impact on retrieval accuracy (40-60% improvement), hallucination rates, and query success rates.
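To see the failure mode concretely, consider a toy sketch (not the session's benchmark code) of naive character-count splitting applied to a contract clause. The document text and chunk size here are illustrative:

```python
# Toy illustration: fixed-width character splitting severs sentences,
# so a retriever can surface a fragment stripped of its qualifying clause.

doc = (
    "Section 4.2: The licensee may sublicense the software only with prior "
    "written consent from the licensor. Without such consent, sublicensing "
    "is prohibited and terminates this agreement."
)

def naive_split(text: str, size: int) -> list[str]:
    """Character-count splitting with no regard for sentence boundaries."""
    return [text[i:i + size] for i in range(0, len(text), size)]

for chunk in naive_split(doc, 80):
    print(repr(chunk))
```

With an 80-character window, the permission ("may sublicense") and its condition ("only with prior written consent") can land in different chunks, so a retrieval hit on the first fragment misrepresents the clause.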
We present production-ready architectural patterns that attendees can implement immediately: semantic-aware splitting that preserves document structure and domain logic, streaming pipelines for processing large files that exceed RAM capacity, adaptive optimization through retrieval feedback loops, and multimodal handling across text, code, and structured documents.
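As a minimal sketch of the semantic-aware splitting pattern (illustrative only, not the released library's API), a splitter can honor paragraph boundaries first and pack whole sentences greedily under a size budget, so no sentence is ever severed:

```python
import re

def semantic_split(text: str, max_chars: int = 300) -> list[str]:
    """Split on paragraph boundaries first, then pack whole sentences
    greedily so no chunk exceeds max_chars and no sentence is severed.
    """
    chunks: list[str] = []
    for para in text.split("\n\n"):
        # Naive sentence boundary: whitespace following ., !, or ?
        sentences = re.split(r"(?<=[.!?])\s+", para.strip())
        current = ""
        for sent in sentences:
            if current and len(current) + len(sent) + 1 > max_chars:
                chunks.append(current)  # budget exceeded: start a new chunk
                current = sent
            else:
                current = f"{current} {sent}".strip()
        if current:
            chunks.append(current)
    return chunks

doc = (
    "Clause 1. Payment is due within 30 days. Late payment accrues interest.\n\n"
    "Clause 2. Either party may terminate with 60 days written notice. "
    "Termination does not waive accrued obligations."
)
for chunk in semantic_split(doc, max_chars=80):
    print(chunk)
```

Every emitted chunk ends on a sentence boundary and respects the paragraph structure; a production splitter would layer on domain logic (headings, code blocks, clause markers) in the same spirit.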
To prove these patterns work at scale, we've released an open-source implementation available on GitHub and PyPI (pip install chunking-strategy). The codebase demonstrates thread-safe parallel processing, comprehensive error handling, and clean abstractions teams can customize for their domains: no vendor lock-in, just production-quality code you can own and extend.