YouTube videos curated by Class Central.
Classroom Contents
Understanding Reasoning LLMs - o1, DeepSeek-R1, Gemini Thinking, Grok 3, Claude 3.7
- 1 00:00 - Introduction
- 2 02:42 - What are reasoning models?
- 3 03:56 - The four approaches to building "reasoning" LLMs
- 4 04:31 - Inference-time scaling
- 5 06:46 - Standard LLM training pipeline
- 6 08:26 - Pure Reinforcement Learning DeepSeek R1-Zero
- 7 12:21 - Supervised Fine Tuning + Reinforcement Learning DeepSeek R1
- 8 17:20 - Summary of SFT+RL approach DeepSeek R1
- 9 18:18 - Distillation
- 10 21:55 - Limitations and challenges of reasoning LLMs