Overview
Explore a 42-minute video on recent developments in AI post-training methods that unify Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL). The video presents current LLM mechanics through the Alignment Engine framework and examines research questions about the impact of mathematical reasoning on general language model capabilities. Learn about transferability mechanisms in LLM reasoning through findings from researchers at Carnegie Mellon University, University of Pennsylvania, University of Washington, M-A-P, and The Hong Kong Polytechnic University. Discover how implicit rewards act as a bridge between SFT and Direct Preference Optimization (DPO), based on collaborative research from Fudan University's School of Computer Science and Shanghai Artificial Intelligence Laboratory. Along the way, gain insight into AI training frameworks, applications of mathematical reasoning, and the theoretical foundations of modern language model alignment techniques.
Syllabus
New AI Framework: Post-Training
Taught by
Discover AI