Teaching AI to Reason: Reinforcement Fine-Tuning for Multi-Turn Agentic Workflows
MLOps World: Machine Learning in Production via YouTube
Overview
Explore how reinforcement fine-tuning (RFT) can turn language models into more reliable decision engines for complex multi-turn agent workflows in this 30-minute talk by Sameer Reddy, Research Engineer at Predibase. Discover techniques for training small, specialized models (1B-3B parameters) that make accurate tool selections and decisions without requiring hand-labeled data. Learn how this approach reduces latency and cost while improving precision in agentic applications. The presentation covers practical implementations, including deferring tool selection to compact RFT models, teaching models to reason step by step (chain of thought) before making a decision, and building modular, low-latency components that plug into existing agent stacks. The talk is particularly valuable for ML engineers and infrastructure teams building production-grade agents who need to optimize costs while maintaining reliability and control over model reasoning.
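As a rough illustration of the idea in the overview (scoring a small model's tool choice programmatically instead of with hand-labeled data), here is a minimal sketch of a reward function of the kind RFT setups typically use. It is not taken from the talk: the `<think>`/`<tool>` output format, the reward weights, and the names `tool_selection_reward`, `allowed_tools`, and `verifier` are all assumptions made for this example.

```python
import re
from typing import Callable

# Hypothetical output format (an assumption, not from the talk):
# reasoning first, then exactly one tool choice.
PATTERN = re.compile(r"^<think>(.+?)</think>\s*<tool>(\w+)</tool>$", re.DOTALL)


def tool_selection_reward(
    completion: str,
    allowed_tools: set[str],
    verifier: Callable[[str], bool],
) -> float:
    """Score one completion without a hand-written label.

    0.0 if the format is wrong or the reasoning is empty,
    0.2 for a well-formed answer,
    +0.3 if the chosen tool exists in the agent's registry,
    +0.5 if a programmatic verifier accepts the choice.
    The weights are illustrative only.
    """
    match = PATTERN.match(completion.strip())
    if match is None:
        return 0.0
    reasoning, tool = match.group(1), match.group(2)
    if not reasoning.strip():
        return 0.0
    reward = 0.2
    if tool in allowed_tools:
        reward += 0.3
        if verifier(tool):
            reward += 0.5
    return reward


if __name__ == "__main__":
    tools = {"get_weather", "search"}
    # Stand-in verifier: in practice this might run the tool and check the outcome.
    accepts_weather = lambda name: name == "get_weather"
    sample = "<think>The user asks about rain today.</think>\n<tool>get_weather</tool>"
    print(tool_selection_reward(sample, tools, accepts_weather))
```

The graded structure rewards reasoning before the decision and a valid tool name even when the final choice is wrong, which gives the policy a denser signal than a single pass/fail check would.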
Syllabus
Teaching AI to Reason: Reinforcement Fine-Tuning for Multi-Turn Agentic Workflows
Taught by
MLOps World: Machine Learning in Production