Comparing SFT and GRPO Methods for AI Model Fine-Tuning


Trelis Research via YouTube



Classroom Contents


  1. 00:00 Introduction to GRPO and Study Overview
  2. 00:51 Detailed Study Methodology
  3. 03:21 Supervised Fine-Tuning (SFT) Explained
  4. 04:57 Odds Ratio Preference Optimization (ORPO)
  5. 07:00 Group Relative Policy Optimization (GRPO)
  6. 10:08 Implementation and Code Walkthrough
  7. 16:16 Training Data Creation and Optimization
  8. 19:22 Analyzing and Comparing Results
  9. 20:28 Setting Up and Running GRPO
  10. 27:19 Understanding Batch Sizes and Backpropagation
  11. 27:51 Setting Up the GRPO Trainer
  12. 28:31 Exploring Reward Functions
  13. 29:12 Densifying Rewards for Better Training
  14. 31:44 Implementing GRPO Training
  15. 33:59 Running Inference and Analyzing Results
  16. 35:33 Challenges and Considerations in GRPO
  17. 44:46 Comparing GRPO with Other Techniques
  18. 48:33 Practical Recommendations for Reinforcement Learning
  19. 54:58 Conclusion and Further Resources
