AI Agent Evaluation - Methods and Best Practices for Measuring Agent Performance

Elvis Saravia via YouTube

Classroom Contents

  1. 00:00 Intros
  2. 02:56 Introduction to AI Agent Evaluation
  3. 09:06 5 Steps to Evaluation Intelligence
  4. 11:13 Components of an Agent
  5. 15:08 Integrated Observability
  6. 15:49 Play 1 - Create high & low-level metrics
  7. 28:12 Play 2 - Select your experimental infrastructure
  8. 32:31 Play 3 - Optimize your instructions
  9. 34:50 Play 4 - Optimize your retrieval
  10. 38:02 Play 5 - Add agentic tests to CI/CD
  11. 39:46 Play 6 - Build SLMs for real-time monitoring
  12. 43:46 Play 7 - Curate large-scale eval sets
  13. 46:21 Play 8 - Improve metric accuracy with human feedback
  14. 49:27 Building Your Evaluation System
  15. 51:23 Agent Leaderboard v2 launch
  16. 57:49 Agent Leaderboard v2 ranking
  17. 1:02:20 Cost vs Performance
  18. 1:05:56 Evaluation Best Practices
  19. 1:08:34 The Triad of Tradeoffs
  20. 1:10:18 Q&A
