Reinforcement Learning with Verifiable Rewards - RLVR Environments for LLMs

Yacine Mahdid via YouTube

Classroom Contents

  1. 00:00 - Introduction: RL’s growing role in agentic AI
  2. 01:10 - The RLVR loop: dataset, policy, rollouts, rewards, updates
  3. 02:13 - Overview of the state of RLVR
  4. 03:50 - Small-model RLVR: performance, latency, and cost benefits
  5. 06:00 - RLVR vs RLHF: key conceptual differences
  6. 07:32 - Open-source frameworks: ReasoningGym, ART, TRL and Verifiers
  7. 08:12 - deep dive into the verifiers 7 steps with math-python env
  8. 08:25 - deep dive into the verifiers | step 1 : data
  9. 09:09 - deep dive into the verifiers | step 2 : interaction style
  10. 09:40 - deep dive into the verifiers | step 3 : environment logic
  11. 10:05 - deep dive into the verifiers | step 4 : rewards function rubric
  12. 11:23 - deep dive into the verifiers | step 5 : parser optional
  13. 11:46 - deep dive into the verifiers | step 6 : package environment
  14. 12:07 - deep dive into the verifiers | step 7 : run eval or training
  15. 12:30 - a few community environments
  16. 13:25 - Case study: Building a Vision-Language RLVR environment feat alexine
  17. 13:56 - vision SR1 - overview
  18. 16:46 - vision SR1 - environment 1
  19. 18:29 - vision SR1 - environment 2
  20. 20:03 - Interview with prime Will Brown, creator of Verifiers
  21. 20:18 - Interview with prime Will Brown - verifiers development story
  22. 23:16 - Interview with prime Will Brown - what's the vision for environment hub?
  23. 24:17 - Interview with prime Will Brown - what future is there for RL environment?
  24. 26:27 - 👺🦋👺🦋👺🦋
