Teaching AI to Reason: Reinforcement Fine-Tuning for Multi-Turn Agentic Workflows
MLOps World: Machine Learning in Production via YouTube
Overview
Explore how reinforcement fine-tuning (RFT) can turn language models into more reliable decision engines for complex multi-turn agent workflows in this 30-minute talk by Sameer Reddy, Research Engineer at Predibase. Discover techniques for training small, specialized models (1B-3B parameters) that select tools and make decisions accurately without requiring hand-labeled data. Learn how this approach reduces both latency and cost while improving precision in agentic applications. The presentation covers practical implementations, including deferring tool selection to compact RFT models, teaching models to reason step by step (chain of thought) before making a decision, and building modular, low-latency components that fit into existing agent stacks. The talk is particularly valuable for ML engineers and infrastructure teams building production-grade agents who need to reduce costs while maintaining reliability and control over model reasoning.
Syllabus
Teaching AI to Reason: Reinforcement Fine-Tuning for Multi-Turn Agentic Workflows
Taught by
MLOps World: Machine Learning in Production