YouTube

TEXAS: Fine-Tuning Is for Cowards - Do RL

Discover AI via YouTube

Overview

This 33-minute talk from Discover AI compares the effectiveness of Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL) for training reasoning capabilities in large vision-language models. It presents the paper "SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models," authored by a collaborative team of researchers from the University of California Santa Cruz, University of Texas at Dallas, Pennsylvania State University, and Amazon Research. The talk covers the paper's findings on SFT+RL versus RL-only training, their implications for AI reasoning, and why such results have a short validity window in the rapidly evolving AI landscape.

Syllabus

TEXAS: Fine-Tuning Is for Cowards - Do RL

Taught by

Discover AI
