Syllabus
00:00 Introduction to GRPO and Study Overview
00:51 Detailed Study Methodology
03:21 Supervised Fine-Tuning (SFT) Explained
04:57 Odds Ratio Preference Optimization (ORPO)
07:00 Group Relative Policy Optimization (GRPO)
10:08 Implementation and Code Walkthrough
16:16 Training Data Creation and Optimization
19:22 Analyzing and Comparing Results
20:28 Setting Up and Running GRPO
27:19 Understanding Batch Sizes and Backpropagation
27:51 Setting Up the GRPO Trainer
28:31 Exploring Reward Functions
29:12 Densifying Rewards for Better Training
31:44 Implementing GRPO Training
33:59 Running Inference and Analyzing Results
35:33 Challenges and Considerations in GRPO
44:46 Comparing GRPO with Other Techniques
48:33 Practical Recommendations for Reinforcement Learning
54:58 Conclusion and Further Resources
Taught by
Trelis Research