Overview
Learn to build reliable evaluation metrics for AI applications in this workshop led by a former Google Search Product Director. The session shows how to design custom metrics that accurately measure performance for a specific use case, drawing on decades of AI development experience at Google Search adapted for modern LLM applications. You will brainstorm and design metrics tailored to your application, then use rapid experimentation to identify which types of signals (natural language, code, or other models) work best. The workshop covers techniques for combining metrics and calibrating them against ground truth data using real-world examples, with accessible tools such as Google Sheets for visualization and analysis. It also covers integrating scoring models into online workflows for agent control and into offline processes for model comparison and training evaluation. The goal is metrics that are accurate, fast, and tunable to ground-truth rater and user behavior, which is essential for trustworthy AI evaluation in production.
Syllabus
[Full Workshop] Building Metrics that actually work — David Karam, Pi Labs (fmr Google Search)
Taught by
AI Engineer