Overview
This 33-minute conference talk from Discover AI compares the effectiveness of Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL) for training reasoning capabilities in large vision-language models. It examines why research findings on SFT+RL versus RL-only approaches have such a short validity window in the rapidly evolving AI landscape, and what those findings imply for AI reasoning. The talk presents research by a collaborative team of scholars from the University of California, Santa Cruz, the University of Texas at Dallas, Pennsylvania State University, and Amazon Research, who authored "SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models."
Syllabus
TEXAS: Fine-Tuning Is for Cowards - Do RL
Taught by
Discover AI