Overview
Learn to control unpredictable AI agent behavior through a systematic observability-driven evaluation framework in this 16-minute conference talk. Discover how LLM agents can drift into failure modes when prompts, retrieval systems, external data sources, and policies interact in unexpected ways. Explore a repeatable, metric-driven methodology for detecting problematic behaviors in production agentic systems, diagnosing the root causes of these issues, and implementing effective corrections at scale. Gain practical insights into monitoring and evaluating LLMs and AI agents to prevent costly failures and maintain reliable performance in real-world deployments.
Syllabus
Taming Rogue AI Agents with Observability-Driven Evaluation — Jim Bennett, Galileo
Taught by
AI Engineer