Beyond Gemini: Using Reinforcement Learning to Unlock Reliable AI Agents with Open LLMs

MLOps World: Machine Learning in Production via YouTube

Overview

In this 33-minute conference talk from MLOps World, Julien Launay, CEO and co-founder of Adaptive ML, explains how reinforcement learning (RL) can be used to build reliable AI agents on open-source large language models. The central case study is the EdTech organization Alloprof, which developed an AI student support agent that the talk presents as superior to Khanmigo by embedding domain expertise through RL fine-tuning rather than prompt engineering alone. According to the talk, smaller open-weight models fine-tuned primarily on synthetic data consistently outperformed state-of-the-art models, including Gemini Coach. The presentation covers dynamic retrieval-augmented generation (agentic RAG), adaptive communication strategies refined through iterative feedback, and synthetic data approaches such as self-play that eliminate the need for extensive real-world data collection. As the former technical lead behind the Falcon 40B and 180B LLMs and a contributor to BLOOM, Launay shares practical insights on democratizing reinforcement fine-tuning for production AI systems.

Syllabus

Beyond Gemini: Using RL to unlock reliable AI agents with open LLMs

Taught by

MLOps World: Machine Learning in Production
