BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//pretalx//pretalx.devconf.info//devconf-us-2025//talk//UGHKK8
BEGIN:VTIMEZONE
TZID:EST
BEGIN:STANDARD
DTSTART:20001029T030000
RRULE:FREQ=YEARLY;BYDAY=-1SU;BYMONTH=10;UNTIL=20061029T070000Z
TZNAME:EST
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
END:STANDARD
BEGIN:STANDARD
DTSTART:20071104T030000
RRULE:FREQ=YEARLY;BYDAY=1SU;BYMONTH=11
TZNAME:EST
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
END:STANDARD
BEGIN:DAYLIGHT
DTSTART:20000402T030000
RRULE:FREQ=YEARLY;BYDAY=1SU;BYMONTH=4;UNTIL=20060402T080000Z
TZNAME:EDT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
END:DAYLIGHT
BEGIN:DAYLIGHT
DTSTART:20070311T030000
RRULE:FREQ=YEARLY;BYDAY=2SU;BYMONTH=3
TZNAME:EDT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
END:DAYLIGHT
END:VTIMEZONE
BEGIN:VEVENT
UID:pretalx-devconf-us-2025-UGHKK8@pretalx.devconf.info
DTSTART;TZID=EST:20250920T092000
DTEND;TZID=EST:20250920T093500
DESCRIPTION:RAG apps save up to 60% of the cost compared to standard LLMs.
  But in this talk\, I will show you a way to save even more $$ on top of
  that\, because 2025 will be all about optimising the cost of building
  LLMs and their apps. RAGCache tackles the cost and latency bottlenecks
  of RAG with cutting-edge techniqu
 es: \n- 𝗗𝘆𝗻𝗮𝗺𝗶𝗰 𝗞𝗻𝗼𝘄𝗹𝗲𝗱𝗴𝗲 
 𝗖𝗮𝗰𝗵𝗶𝗻𝗴: Stores intermediate states in a structured k
 nowledge tree\, balancing GPU and host memory usage.\n- 𝗘𝗳𝗳𝗶
 𝗰𝗶𝗲𝗻𝘁 𝗥𝗲𝗽𝗹𝗮𝗰𝗲𝗺𝗲𝗻𝘁 𝗣𝗼
 𝗹𝗶𝗰𝘆: Tailored for LLM inference and RAG retrieval patterns. \
 n- 𝗦𝗲𝗮𝗺𝗹𝗲𝘀𝘀 𝗢𝘃𝗲𝗿𝗹𝗮𝗽: Combines
  retrieval and inference to minimize latency. \nIntegrating RAGCache with 
 tools like vLLM and Faiss delivers: \n- 𝟰𝘅 𝗙𝗮𝘀𝘁𝗲𝗿 
 Time to First Token (TTFT). \n- 𝟮.𝟭𝘅 𝗧𝗵𝗿𝗼𝘂𝗴𝗵
 𝗽𝘂𝘁 𝗕𝗼𝗼𝘀𝘁\, optimizing latency and computational e
 fficiency. \nThe talk goes through:\n1. Current challenges of RAG\n2. A so
 lution that reduces cost and improves user experience\n3. How does it work
 ?\n4. How well does it perform?\n5. What are the key benefits?\n6. Lastly\
 , a few real-world applications
DTSTAMP:20260315T082823Z
LOCATION:Ladd Room (Capacity 170)
SUMMARY:Smarter RAG\, Smaller Bill: Optimize for Performance and Price - KE
 ERTHI UDAYAKUMAR
URL:https://pretalx.devconf.info/devconf-us-2025/talk/UGHKK8/
END:VEVENT
END:VCALENDAR
