Syllabus
00:00 Introduction to GRPO and Study Overview
00:51 Detailed Study Methodology
03:21 Supervised Fine-Tuning (SFT) Explained
04:57 Odds Ratio Preference Optimization (ORPO)
07:00 Group Relative Policy Optimization (GRPO)
10:08 Implementation and Code Walkthrough
16:16 Training Data Creation and Optimization
19:22 Analyzing and Comparing Results
20:28 Setting Up and Running GRPO
27:19 Understanding Batch Sizes and Backpropagation
27:51 Setting Up the GRPO Trainer
28:31 Exploring Reward Functions
29:12 Densifying Rewards for Better Training
31:44 Implementing GRPO Training
33:59 Running Inference and Analyzing Results
35:33 Challenges and Considerations in GRPO
44:46 Comparing GRPO with Other Techniques
48:33 Practical Recommendations for Reinforcement Learning
54:58 Conclusion and Further Resources
Taught by
Trelis Research