YouTube videos curated by Class Central.
Classroom Contents
AI Agent Evaluation - Methods and Best Practices for Measuring Agent Performance
- 1 00:00 Intros
- 2 02:56 Introduction to AI Agent Evaluation
- 3 09:06 5 Steps to Evaluation Intelligence
- 4 11:13 Components of An Agent
- 5 15:08 Integrated Observability
- 6 15:49 Play 1 - Create high & low-level metrics
- 7 28:12 Play 2 - Select your experimental infrastructure
- 8 32:31 Play 3 - Optimize your instructions
- 9 34:50 Play 4 - Optimize your retrieval
- 10 38:02 Play 5 - Add agentic tests to CI/CD
- 11 39:46 Play 6 - Build SLMs for real-time monitoring
- 12 43:46 Play 7 - Curate large-scale eval sets
- 13 46:21 Play 8 - Improve metric accuracy with human feedback
- 14 49:27 Building Your Evaluation System
- 15 51:23 Agent Leaderboard v2 launch
- 16 57:49 Agent Leaderboard v2 ranking
- 17 1:02:20 Cost vs Performance
- 18 1:05:56 Evaluation Best Practices
- 19 1:08:34 The Triad of Tradeoffs
- 20 1:10:18 Q&A