DeepSeek R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

AI Bites via YouTube

Overview

Explore a detailed video explanation of the DeepSeek R1 research paper, focusing on how reinforcement learning can enhance reasoning capabilities in Large Language Models (LLMs). Learn how DeepSeek-R1-Zero is trained with large-scale reinforcement learning without Supervised Fine-Tuning (SFT) as a preliminary step, an approach the paper presents as the first open research to show that reasoning can be incentivized through RL alone. The video covers the model architecture, the training methodology including Group Relative Policy Optimization (GRPO), reward modeling techniques, and performance metrics. It also walks through the self-evolution process and the results reported for this training approach. Additional resources include the official DeepSeek platform, API documentation, and related research papers.
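As a rough illustration of the "group relative" idea in Group Relative Policy Optimization, the sketch below scores each sampled answer against the other answers drawn for the same prompt, rather than against a learned value model. The function name, the `eps` term, the use of the population standard deviation, and the sample rewards are all assumptions made for this example; it shows only the advantage normalization step, not the full training objective.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own group's mean and standard deviation.

    `rewards` holds one scalar reward per sampled answer to a single prompt.
    `eps` is an assumed guard against division by zero when all rewards match.
    """
    if not rewards:
        raise ValueError("rewards must contain at least one value")
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption for this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four sampled answers to one prompt (1.0 = correct).
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Answers that beat the group average receive a positive advantage and the rest a negative one, so the signal comes from comparing samples within the group.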

Syllabus

0:00 - Intro
2:38 - Training LLMs
5:05 - DeepSeek R1 Zero Training
5:54 - Group Relative Policy Optimization
8:45 - Reward Modelling
10:21 - Training Performance
11:33 - Self-evolution
17:20 - Results

Taught by

AI Bites
