DevConf.US 2025

KEERTHI UDAYAKUMAR

I am an AI aficionado by passion and a Data Scientist by profession, with broad experience building LLM applications with different models and researching new fine-tuning methods. I consider myself a think tank with industry-ready, domain-specific skills to apply and innovate for the betterment of society. I have also spoken at various meetups, podcasts, and technical events about LLMs and Data Science; my previous talk, on LLM Security, was at the Techtonic 2.0 event in Bangalore, India. Connect with me to collaborate on AI projects and to join hackathons that build for society.


Job title –

ASSOCIATE DATA SCIENTIST

Company or affiliation –

RED HAT


Sessions

09-19
10:20
15min
ZenZone: AI-Powered Peace of Mind
KEERTHI UDAYAKUMAR

The project assesses the feasibility and effectiveness of an AI-enabled chatbot for mental health detection, employing Large Language Models (LLMs), Natural Language Processing (NLP), and deep learning models. The web application integrates social attributes to aid users with mental health concerns, offering self-assistance through personalized assessments. The core strategy centers on fostering an "Optimistic Presence" by deploying an AI-driven virtual assistant capable of empathetic conversation, active listening, and emotional-state analysis. The methodology emulates human mental health professionals, assessing conditions through various cues and offering tailored therapeutic interventions for stressed individuals. Integration with health records using Azure PostgreSQL allows collaboration with human providers for comprehensive care. This innovative solution seeks to extend constant virtual AI therapy, revolutionizing mental health support with technology-driven, personalized assistance for students, working professionals, and the many hidden victims of poor mental health.

Open Track
Hewitt Boardroom (Capacity 35)
09-20
09:20
15min
Smarter RAG, Smaller Bill: Optimize for Performance and Price
KEERTHI UDAYAKUMAR

RAG apps can save up to 60% of the cost compared to standard LLM usage. In this talk, I will show you a way to save even more on top of that, because 2025 will be all about optimizing the cost of building LLMs and their applications. RAGCache tackles the remaining bottlenecks with cutting-edge techniques:
- ๐——๐˜†๐—ป๐—ฎ๐—บ๐—ถ๐—ฐ ๐—ž๐—ป๐—ผ๐˜„๐—น๐—ฒ๐—ฑ๐—ด๐—ฒ ๐—–๐—ฎ๐—ฐ๐—ต๐—ถ๐—ป๐—ด: Stores intermediate states in a structured knowledge tree, balancing GPU and host memory usage.
- ๐—˜๐—ณ๐—ณ๐—ถ๐—ฐ๐—ถ๐—ฒ๐—ป๐˜ ๐—ฅ๐—ฒ๐—ฝ๐—น๐—ฎ๐—ฐ๐—ฒ๐—บ๐—ฒ๐—ป๐˜ ๐—ฃ๐—ผ๐—น๐—ถ๐—ฐ๐˜†: Tailored for LLM inference and RAG retrieval patterns.
- ๐—ฆ๐—ฒ๐—ฎ๐—บ๐—น๐—ฒ๐˜€๐˜€ ๐—ข๐˜ƒ๐—ฒ๐—ฟ๐—น๐—ฎ๐—ฝ: Combines retrieval and inference to minimize latency.
Integrating RAGCache with tools like vLLM and Faiss delivers:
- ๐Ÿฐ๐˜… ๐—™๐—ฎ๐˜€๐˜๐—ฒ๐—ฟ Time to First Token (TTFT).
- ๐Ÿฎ.๐Ÿญ๐˜… ๐—ง๐—ต๐—ฟ๐—ผ๐˜‚๐—ด๐—ต๐—ฝ๐˜‚๐˜ ๐—•๐—ผ๐—ผ๐˜€๐˜, optimizing latency and computational efficiency.
The talk goes through:
1. Current challenges of RAG
2. A solution that reduces cost and improves user experience
3. How does it work?
4. How well does it perform?
5. What are the key benefits?
6. Lastly, a few real-world applications
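The core idea behind knowledge caching can be illustrated with a minimal sketch: keep the expensive prefill state computed for a retrieved document in a bounded cache, so later queries that retrieve the same document skip recomputation. This is not the RAGCache implementation (which uses a knowledge tree spanning GPU and host memory and a RAG-aware eviction policy); the class, names, and LRU policy below are simplified assumptions for illustration.

```python
from collections import OrderedDict

class PrefixCache:
    """Hypothetical LRU cache of per-document prefill states."""

    def __init__(self, capacity: int):
        self.capacity = capacity      # max cached documents (memory budget)
        self._store = OrderedDict()   # doc_id -> precomputed state
        self.hits = 0
        self.misses = 0

    def get_or_compute(self, doc_id: str, compute):
        if doc_id in self._store:
            self._store.move_to_end(doc_id)   # mark as recently used
            self.hits += 1
            return self._store[doc_id]
        self.misses += 1
        state = compute(doc_id)               # stand-in for expensive prefill
        self._store[doc_id] = state
        if len(self._store) > self.capacity:
            self._store.popitem(last=False)   # evict least recently used
        return state

# Repeated retrievals of the same document hit the cache instead of
# recomputing the prefill, which is where the TTFT savings come from.
cache = PrefixCache(capacity=2)
for doc in ["faq", "policy", "faq", "faq"]:
    cache.get_or_compute(doc, lambda d: f"state({d})")
print(cache.hits, cache.misses)  # → 2 2
```

A real system would key the cache on the token prefix rather than a document ID and weigh eviction by recomputation cost, but the hit/miss accounting above captures why reuse across queries cuts both latency and spend.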

Artificial Intelligence and Data Science
Ladd Room (Capacity 96)