Linux Foundation

Who Let the Bots Out? A Guide to Evaluating AI Agents

Linux Foundation via YouTube

Overview

Learn to systematically evaluate AI agents with an open source framework that breaks performance assessment into three dimensions: tool use, trajectory, and goal. Explore tool use evaluation by examining each step, from tool selection and parameter capture to execution, to confirm that individual components operate correctly. Understand trajectory evaluation techniques that scrutinize an agent's overall workflow to verify that it follows an optimal, efficient sequence of actions. Master goal evaluation strategies that quantitatively determine whether an agent achieves its specified outcome. Discover how this methodology pinpoints failure points in each dimension and provides actionable insights for iterative improvement. Gain a reproducible approach to benchmarking and optimizing AI agents, bridging the gap between experimental development and reliable production deployment of LLM-based systems that manage complex, multi-step tasks.
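To make the three dimensions concrete, here is a minimal Python sketch of how each might be scored for a single agent run. It is not the framework presented in the talk, which this listing does not name. The `ToolCall` type, the three scoring functions, the tool names, and the scoring rules (position-by-position matching for tool use, longest common subsequence for trajectory, required-substring checks for goal) are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class ToolCall:
    """One step an agent took (hypothetical record format)."""
    name: str
    params: dict[str, Any]
    succeeded: bool = True


def tool_use_score(actual: list[ToolCall], expected: list[ToolCall]) -> float:
    """Fraction of expected steps whose tool, parameters, and execution all
    check out, compared position by position."""
    if not expected:
        return 1.0
    hits = sum(
        1
        for a, e in zip(actual, expected)
        if a.name == e.name and a.params == e.params and a.succeeded
    )
    return hits / len(expected)


def trajectory_score(actual: list[ToolCall], expected: list[ToolCall]) -> float:
    """Longest common subsequence of tool names divided by the longer
    sequence length, so both missing and redundant steps lower the score."""
    a = [c.name for c in actual]
    e = [c.name for c in expected]
    if not a and not e:
        return 1.0
    prev = [0] * (len(e) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(e, start=1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1] / max(len(a), len(e))


def goal_achieved(final_output: str, required: list[str]) -> bool:
    """True if every required phrase appears in the final answer
    (case-insensitive). A stand-in for a real outcome check."""
    text = final_output.lower()
    return all(phrase.lower() in text for phrase in required)


# Reference run for a hypothetical refund task.
expected = [
    ToolCall("lookup_order", {"order_id": "A17"}),
    ToolCall("issue_refund", {"order_id": "A17", "amount": 20}),
]
# The agent looked the order up twice before refunding.
actual = [
    ToolCall("lookup_order", {"order_id": "A17"}),
    ToolCall("lookup_order", {"order_id": "A17"}),
    ToolCall("issue_refund", {"order_id": "A17", "amount": 20}),
]

tool_score = tool_use_score(actual, expected)        # 0.5
traj_score = trajectory_score(actual, expected)      # 2/3
goal_ok = goal_achieved("Refund of $20 issued for order A17.", ["refund", "A17"])  # True
```

In this run the goal check passes while the tool use and trajectory scores fall short, which illustrates why scoring the dimensions separately can surface a failure point (here, a redundant lookup) that an outcome-only check would miss.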

Syllabus

Who Let the Bots Out? A Guide to Evaluating AI Agents - James Cha-Earley, Snowflake

Taught by

Linux Foundation
