Everything Hard About Building AI Agents Today

MLOps.community via YouTube

Overview

This 47-minute conference talk features Willem Pienaar (CTO of Cleric) and Shreya Shankar (PhD student in data management for machine learning) on the challenges of building and evaluating AI agents in production. It opens with the fundamental problem of evaluating agents when "ground truth" is ambiguous and subjective user feedback is not enough to improve performance. The speakers describe three "gulfs" of human-AI interaction (Specification, Generalization, and Comprehension) and how each affects agent success rates. They discuss strategies for moving humans "out of the loop" for feedback collection, using implicit signals instead of manual review to shorten learning cycles. Practical evaluation techniques include task failure analysis with heat maps, along with the trade-offs of testing AI agents in simulated environments. The talk also addresses performance ceilings in AI systems and sorting problems into three categories: what your agent can solve now, what it can learn to solve, and what it will likely never be able to solve. Other topics include trust issues in AI data, cloud clarity and retrieval, fixing communication gaps, smarter feedback for prompts, creative data exploration, custom versus general AI, sharpening agent skills, catching repeat failures, self-healing software, and monitoring AI systems in production.

Syllabus

[00:00] Trust Issues in AI Data
[04:49] Cloud Clarity Meets Retrieval
[09:37] Why Fast AI Is Hard
[11:10] Fixing AI Communication Gaps
[14:53] Smarter Feedback for Prompts
[19:23] Creativity Through Data Exploration
[23:46] Helping Engineers Solve Faster
[26:03] The Three Gaps in AI
[28:08] Alerts Without the Noise
[33:22] Custom vs General AI
[34:14] Sharpening Agent Skills
[40:01] Catching Repeat Failures
[43:38] Rise of Self-Healing Software
[44:12] The Chaos of Monitoring AI

Taught by

MLOps.community
