Overview
Explore comprehensive language model post-training methodologies available in 2025 through this 36-minute conference talk from DevConf.US. Discover offline training approaches including Supervised Fine-Tuning (SFT), Parameter-Efficient Fine-Tuning (PEFT), Direct Preference Optimization (DPO), and continual learning techniques for enhancing existing instruction-following models. Learn about online reinforcement learning methods such as Reinforcement Learning from Human Feedback (RLHF) and Group Relative Policy Optimization (GRPO). Understand the specific use cases for each post-training method and gain practical guidance on implementing these techniques using the Training Hub platform. Master the latest customization options for adapting language models to specific requirements and applications.
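As a small illustration of one of the offline methods named above, the sketch below computes the Direct Preference Optimization (DPO) loss for a single preference pair. This is a minimal, assumption-laden sketch (the function name, example log-probabilities, and the β value are illustrative, not from the talk): DPO trains the policy directly on chosen/rejected response pairs, using a frozen reference model instead of a separate reward model.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair (illustrative sketch).

    Each argument is the summed log-probability of the chosen or
    rejected response under the policy being trained or under a
    frozen reference model; beta scales the implicit reward.
    """
    # Implicit rewards: how much more (or less) likely the policy
    # makes each response compared with the reference model.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    # -log(sigmoid(margin)): shrinks as the policy widens the gap
    # between the chosen and rejected responses.
    margin = chosen_reward - rejected_reward
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# With equal policy and reference log-probs the margin is zero,
# giving the starting loss -log(0.5) ≈ 0.693; once the policy
# favors the chosen response more than the reference does, the
# loss drops below that.
baseline = dpo_loss(-10.0, -14.0, -10.0, -14.0)
improved = dpo_loss(-10.0, -14.0, -12.0, -13.0)
```

In practice, libraries compute this loss over batches of tokenized preference pairs; the scalar version here only shows the shape of the objective.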
Syllabus
Language Model Post-Training in 2025: an Overview of Customization Options Today - DevConf.US 2025
Taught by
DevConf