Overview
This video explores two AI research studies on scaling reinforcement learning through structured cognitive behaviors and extended chain-of-thought reasoning. Researchers from Stanford University identified four key cognitive behaviors that enable self-improving reasoners, while teams from IN.AI, Tsinghua University, and Carnegie Mellon University work to demystify long chain-of-thought reasoning in large language models. The video presents the two approaches as complementary, together outlining a roadmap for AI systems that can solve complex problems and explain their reasoning in ways that are both scientifically precise and intuitively accessible. The 34-minute presentation examines how a 3B-parameter model can be enhanced through these techniques, and is aimed at anyone interested in recent advances in AI reasoning and in methods for scaling reinforcement learning.
Syllabus
Scaling RL: 3B AI w Long Chain-of-Thought & 4 Patterns
Taught by
Discover AI