Class Central Classrooms beta
YouTube videos curated by Class Central.
Classroom Contents
Training LLMs to Think - Understanding o1 and DeepSeek-R1 Models
- 1 Intro - 0:00
- 2 OpenAI's o1 - 0:33
- 3 Test-time Compute - 1:33
- 4 "Thinking" Tokens - 3:50
- 5 DeepSeek Paper - 5:58
- 6 Reinforcement Learning - 7:22
- 7 R1-Zero: Prompt Template - 9:28
- 8 R1-Zero: Reward - 10:53
- 9 R1-Zero: GRPO technical - 12:53
- 10 R1-Zero: Results - 20:00
- 11 DeepSeek R1 - 23:32
- 12 Step 1: SFT with CoT - 24:47
- 13 Step 2: R1-Zero Style RL - 26:14
- 14 Step 3: SFT with Mixed Data - 27:03
- 15 Step 4: RL & RLHF - 28:26
- 16 Accessing DeepSeek Models - 29:18
- 17 Conclusions - 30:10