Overview
Explore a detailed video explanation of the DeepSeek R1 research paper, focusing on how reinforcement learning can enhance reasoning capabilities in Large Language Models (LLMs). Learn how DeepSeek R1 Zero is trained with reinforcement learning alone, without Supervised Fine-Tuning (SFT) as a preliminary step, an approach the paper presents as the first open research to show that reasoning can be incentivized purely through reinforcement learning. The video covers the model architecture, the training methodology including Group Relative Policy Optimization, reward modeling techniques, and performance metrics. Understand the self-evolution process and examine the results reported for this training approach. Additional resources include the official DeepSeek platform, API documentation, and related research papers.
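As a rough illustration of the "group relative" idea in Group Relative Policy Optimization, the sketch below scores each sampled answer against the other answers drawn for the same prompt, rather than against a separate learned value model. The function name, the example rewards, the `eps` term, and the use of the population standard deviation are assumptions made for this sketch; they are not taken from the video or from DeepSeek's code.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward by its own group's mean and standard deviation.

    `rewards` holds one scalar reward per answer sampled for a single prompt.
    `eps` (an assumption of this sketch) avoids division by zero when every
    answer in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; a sample std is also plausible
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rule-based rewards for four sampled answers (1.0 = correct).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Answers that beat their group's average get a positive advantage and the rest get a negative one, which is the signal the policy update then uses.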
Syllabus
0:00 - Intro
2:38 - Training LLMs
5:05 - DeepSeek R1 Zero Training
5:54 - Group Relative Policy Optimization
8:45 - Reward Modelling
10:21 - Training Performance
11:33 - Self-evolution
17:20 - Results
Taught by
AI Bites