DevConf.US 2025

Language Model Post-Training in 2025: An Overview of Customization Options Today
2025-09-20, Ladd Room (Capacity 96)

Join us for an overview of the latest language model post-training methods openly available today! We will begin with offline methods like standard Supervised Fine-Tuning (SFT), Parameter-Efficient Fine-Tuning (PEFT), Direct Preference Optimization (DPO), and continual learning techniques for further tuning existing instruct models. We will then move into online reinforcement learning options like Reinforcement Learning from Human Feedback (RLHF) and Group Relative Policy Optimization (GRPO). The talk will walk through the use cases for each method, as well as how to get started today via our very own Training Hub!
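As a flavor of the offline methods listed above, here is a minimal Parameter-Efficient Fine-Tuning sketch using LoRA adapters via the Hugging Face peft and transformers libraries (the talk's own Training Hub is not shown here); the base model name and hyperparameters are illustrative placeholders, not recommendations from the session.

```python
# Minimal LoRA (PEFT) sketch with Hugging Face `peft` + `transformers`.
# Base model name and hyperparameters below are illustrative placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_model = "Qwen/Qwen2.5-0.5B-Instruct"  # placeholder: any causal LM checkpoint
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)

# Wrap the frozen base model with low-rank adapters on the attention projections.
lora_config = LoraConfig(
    r=16,                 # adapter rank
    lora_alpha=32,        # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically a small fraction of total parameters

# From here, the adapter-wrapped model can be trained with any standard SFT loop
# (e.g. transformers.Trainer or trl.SFTTrainer) and the adapters saved separately.
```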


What level of experience should the audience have to best understand your session?

Intermediate - attendees should be familiar with the subject

A research engineer with a focus on language models. Currently working on model reasoning, efficiency, and customization. Previously worked on text-to-SQL translation, speech recognition, and distributed systems for AI/ML workloads.