Teaching AI to Reason: Reinforcement Fine-Tuning for Multi-Turn Agentic Workflows

Explore how reinforcement fine-tuning (RFT) can turn language models into more reliable decision engines for complex multi-turn agent workflows in this 30-minute talk by Sameer Reddy, Research Engineer at Predibase. Discover techniques for training small, specialized models (1B-3B parameters) that make accurate tool selections and decisions without requiring hand-labeled data. Learn how this approach reduces both latency and cost while improving precision in agentic applications. The presentation covers practical implementations, including deferring tool selection to compact RFT models, teaching chain-of-thought reasoning before decision-making, and building modular, low-latency components for existing agent stacks. The talk is particularly valuable for ML engineers and infrastructure teams who build production-grade agents and need to reduce costs while maintaining reliability and control over model reasoning.
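To make the idea of training without hand-labeled data more concrete, here is a minimal sketch of the kind of programmatic reward function an RFT setup might use for tool selection. It is not taken from the talk: the `<think>`/`<tool>` output format, the `KNOWN_TOOLS` registry, the `tool_succeeded` callback, and the 0.2/0.8 reward split are all assumptions chosen for illustration.

```python
import re
from typing import Callable

# Hypothetical tool registry; the names are illustrative only.
KNOWN_TOOLS = {"search", "calculator", "calendar"}

# Assumed output format: reasoning inside <think>, then exactly one <tool> choice.
PATTERN = re.compile(
    r"\A\s*<think>(.+?)</think>\s*<tool>(\w+)</tool>\s*\Z", re.DOTALL
)


def reward(completion: str, tool_succeeded: Callable[[str], bool]) -> float:
    """Score one model completion without a hand-written label.

    tool_succeeded is an assumed programmatic check (for example, whether
    the downstream call ran without error) that stands in for a human label.
    """
    match = PATTERN.match(completion)
    if match is None:
        return 0.0  # wrong format: no reasoning block or no tool choice
    reasoning, tool = match.group(1).strip(), match.group(2)
    if not reasoning or tool not in KNOWN_TOOLS:
        return 0.0  # empty reasoning or a tool that does not exist
    score = 0.2  # small reward for reasoning first and naming a valid tool
    if tool_succeeded(tool):
        score += 0.8  # main reward when the automated check passes
    return score


if __name__ == "__main__":
    def checker(tool: str) -> bool:
        # Toy stand-in for a real outcome check.
        return tool == "calculator"

    print(reward("<think>This needs arithmetic.</think><tool>calculator</tool>", checker))
    print(reward("<think>Maybe look it up.</think><tool>search</tool>", checker))
    print(reward("calculator", checker))
```

The design choice illustrated here is that the reward comes from checks a program can run (format, tool validity, an outcome test) rather than from annotated examples; a real setup would replace `checker` with whatever verifiable signal the agent stack exposes.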