Teaching AI to Reason: Reinforcement Fine-Tuning for Multi-Turn Agentic Workflows
MLOps World: Machine Learning in Production via YouTube
Overview
In this 30-minute talk, Sameer Reddy, Research Engineer at Predibase, explores how reinforcement fine-tuning (RFT) can make language models more reliable decision engines for complex multi-turn agent workflows. The talk presents techniques for training small, specialized models (1B-3B parameters) to select tools and make decisions accurately without hand-labeled data, an approach that reduces latency and cost while improving precision in agentic applications. Practical implementations covered include deferring tool selection to compact RFT models, teaching models to reason through a chain of thought before deciding, and building modular, low-latency components that plug into existing agent stacks. The talk is particularly valuable for ML engineers and infrastructure teams building production-grade agents who need to cut costs while keeping reliability and control over model reasoning.
Syllabus
Teaching AI to Reason: Reinforcement Fine-Tuning for Multi-Turn Agentic Workflows
Taught by
MLOps World: Machine Learning in Production