Class Central Classrooms beta
YouTube videos curated by Class Central.
Classroom Contents
Comparing SFT and GRPO Methods for AI Model Fine-Tuning
- 1 00:00 Introduction to GRPO and Study Overview
- 2 00:51 Detailed Study Methodology
- 3 03:21 Supervised Fine-Tuning (SFT) Explained
- 4 04:57 Odds Ratio Preference Optimization (ORPO)
- 5 07:00 Group Relative Policy Optimization (GRPO)
- 6 10:08 Implementation and Code Walkthrough
- 7 16:16 Training Data Creation and Optimization
- 8 19:22 Analyzing and Comparing Results
- 9 20:28 Setting Up and Running GRPO
- 10 27:19 Understanding Batch Sizes and Backpropagation
- 11 27:51 Setting Up the GRPO Trainer
- 12 28:31 Exploring Reward Functions
- 13 29:12 Densifying Rewards for Better Training
- 14 31:44 Implementing GRPO Training
- 15 33:59 Running Inference and Analyzing Results
- 16 35:33 Challenges and Considerations in GRPO
- 17 44:46 Comparing GRPO with Other Techniques
- 18 48:33 Practical Recommendations for Reinforcement Learning
- 19 54:58 Conclusion and Further Resources