Understanding Reasoning LLMs - o1, DeepSeek-R1, Gemini Thinking, Grok 3, Claude 3.7

Donato Capitella via YouTube

Overview

This video explores the recent wave of reasoning-focused large language models, including OpenAI's o1, Google's Gemini Thinking, DeepSeek's R1, xAI's Grok 3, and Anthropic's Claude 3.7. Dive into what reasoning models actually are, how they're trained, and why they represent an evolution rather than a revolution in AI capabilities. Learn about the four main approaches to building reasoning LLMs: inference-time scaling, pure reinforcement learning, supervised fine-tuning combined with reinforcement learning, and distillation. Discover why these models, despite their improved performance on structured problems, are still next-token predictors built on the same transformer architecture as other LLMs. The presentation explains the training pipelines in detail, with a special focus on DeepSeek's R1, and concludes with the current limitations and challenges facing reasoning LLMs. A downloadable canvas/mindmap is provided to help visualize these concepts.

Syllabus

00:00 - Introduction
02:42 - What are reasoning models?
03:56 - The four approaches to building "reasoning" LLMs
04:31 - Inference-time scaling
06:46 - Standard LLM training pipeline
08:26 - Pure Reinforcement Learning (DeepSeek R1-Zero)
12:21 - Supervised Fine-Tuning + Reinforcement Learning (DeepSeek R1)
17:20 - Summary of the SFT+RL approach (DeepSeek R1)
18:18 - Distillation
21:55 - Limitations and challenges of reasoning LLMs

Taught by

Donato Capitella
