Overview
This presentation from the MLOps community gives a structured overview of evaluating agentic systems and addresses the limitations of standard evaluation methods when applied to complex AI agents. It covers common single-agent and multi-agent patterns, explains why rigorous evaluation is necessary, and lays out core principles for meaningful assessment. It then surveys evaluation methods (including benchmarks, simulation, and human feedback) and metrics for measuring agentic system performance, and examines key open challenges in the field. The presenter is Aditya Gautam, a machine learning expert who leads foundational integrity efforts for Llama models. He previously worked on improving Facebook recommendation algorithms, has experience at Google and at startups, and has spoken at various events in the Generative AI community. This 28-minute talk is part of the bi-weekly "Agent Hour" event series hosted by MLOps.community.
Syllabus
Evaluation of Agentic System // Aditya Gautam // Agent Hour
Taught by
MLOps.community