DevConf.US 2025

Language Model Post-Training in 2025: An Overview of Customization Options Today
2025-09-20, Ladd Room (Capacity 96)

Join us for an overview of the latest language model post-training methods openly available today! We will begin with offline methods like standard Supervised Fine-Tuning (SFT), Parameter-Efficient Fine-Tuning (PEFT), Direct Preference Optimization (DPO), and continual learning techniques for further tuning existing instruct models. We will then move into online reinforcement learning options like Reinforcement Learning from Human Feedback (RLHF) and Group Relative Policy Optimization (GRPO). The talk will walk through the use cases for each method, as well as how to get started today via our very own Training Hub!
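As a flavor of the offline methods listed above, here is a minimal Parameter-Efficient Fine-Tuning sketch using LoRA adapters via the Hugging Face peft and transformers libraries (the talk's own Training Hub is not shown here); the base model name and hyperparameters are illustrative placeholders, not recommendations from the session.

```python
# Minimal LoRA (PEFT) sketch with Hugging Face `peft` + `transformers`.
# Base model name and hyperparameters below are illustrative placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_model = "Qwen/Qwen2.5-0.5B-Instruct"  # placeholder: any causal LM checkpoint
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)

# Wrap the frozen base model with low-rank adapters on the attention projections.
lora_config = LoraConfig(
    r=16,                 # adapter rank
    lora_alpha=32,        # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically a small fraction of total parameters

# From here, the adapter-wrapped model can be trained with any standard SFT loop
# (e.g. transformers.Trainer or trl.SFTTrainer) and the adapters saved separately.
```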


What level of experience should the audience have to best understand your session?

Intermediate - attendees should be familiar with the subject

A research engineer with a focus on language models. Currently working on model reasoning, efficiency, and customization. Previously worked on text-to-SQL translation, speech recognition, and distributed systems for AI/ML workloads.