Class Central Classrooms beta
YouTube videos curated by Class Central.
Classroom Contents
Reinforcement Learning with Verifiable Rewards - RLVR Environments for LLMs
- 1 00:00 - Introduction: RL’s growing role in agentic AI
- 2 01:10 - The RLVR loop: dataset, policy, rollouts, rewards, updates
- 3 02:13 - Overview of the state of RLVR
- 4 03:50 - Small-model RLVR: performance, latency, and cost benefits
- 5 06:00 - RLVR vs RLHF: key conceptual differences
- 6 07:32 - Open-source frameworks: ReasoningGym, ART, TRL and Verifiers
- 7 08:12 - Deep dive into Verifiers: the 7 steps with the math-python env
- 8 08:25 - Deep dive into Verifiers | Step 1: data
- 9 09:09 - Deep dive into Verifiers | Step 2: interaction style
- 10 09:40 - Deep dive into Verifiers | Step 3: environment logic
- 11 10:05 - Deep dive into Verifiers | Step 4: reward functions (rubric)
- 12 11:23 - Deep dive into Verifiers | Step 5: parser (optional)
- 13 11:46 - Deep dive into Verifiers | Step 6: package the environment
- 14 12:07 - Deep dive into Verifiers | Step 7: run eval or training
- 15 12:30 - A few community environments
- 16 13:25 - Case study: Building a Vision-Language RLVR environment, feat. Alexine
- 17 13:56 - Vision SR1 - overview
- 18 16:46 - Vision SR1 - environment 1
- 19 18:29 - Vision SR1 - environment 2
- 20 20:03 - Interview with Prime Intellect's Will Brown, creator of Verifiers
- 21 20:18 - Interview with Prime Intellect's Will Brown - Verifiers development story
- 22 23:16 - Interview with Prime Intellect's Will Brown - what's the vision for the environment hub?
- 23 24:17 - Interview with Prime Intellect's Will Brown - what future is there for RL environments?
- 24 26:27 - 👺🦋👺🦋👺🦋