Overview
Learn about DeepSeek's learning method "Self-Principled Critique Tuning" (SPCT) and its new reward model DeepSeek-GRM-27B in this 19-minute explanatory video. See how the approach works and why it might form the foundation for a future DeepSeek R2. The video covers the paper "Inference-Time Scaling for Generalist Reward Modeling" by researchers from DeepSeek-AI and Tsinghua University, and offers insight into where AI reasoning models and reward systems may be heading.
Syllabus
NEW by DeepSeek: SPCT w/ DeepSeek-GRM-27B
Taught by
Discover AI