Overview
Syllabus
[00:00] Challenges in Evaluating AI Agents
[04:57] Synthetic Data: Benefits and Challenges
[08:41] Simulation vs Evaluation with LLMs
[11:47] Red Teaming for System Testing
[16:26] Voice Agents and Text Core
[19:41] Automating Insight Discovery at Scale
[25:12] Guardrails and AI Simulations
[28:39] Training Models in Simulated Environments
[30:06] Snowglobe: Chat Simulation Tool
[34:05] AI Testing and Performance Criteria
[39:23] AI Agents and Self-Driving Inspiration
[41:36] Ensuring Graceful Self-Driving Failures
[43:52] AI Testing: Risks and Engagement
[47:00] Tool Configuration Testing Scenarios
Taught by
MLOps.community