YouTube

Complete Beginner's Course on AI Evaluations in 50 Minutes - 2025

Peter Yang via YouTube

Overview

Learn AI evaluations (evals) in a hands-on tutorial in which two product managers build an evaluation system from scratch for an AI customer support agent. The course first introduces four types of AI evaluations, then walks through the full process: generating prompts with Anthropic's console, defining evaluation criteria, and adding human labels to a golden dataset. It closes with scaling evaluations using LLM-judge prompts and aligning LLM judges with human judgment, so that automated assessments stay reliable when measuring AI performance in real-world applications.

Syllabus

00:00 What are AI evals and how to get good at them
02:52 The 4 types of AI evaluations everyone should know
06:08 Live demo: Building evals for a customer support agent
10:29 Using Anthropic's console to generate great prompts
15:13 Creating the evaluation criteria
17:40 Adding human labels to the golden dataset
31:05 Scaling evals with LLM-judge prompts
38:21 How to align LLM judges with human judgment
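
To make the last two syllabus topics concrete, here is a minimal Python sketch of one common way to check whether an LLM judge agrees with human labels on a golden dataset. It is not taken from the video: the pass/fail label scheme, the sample data, and the function names `agreement_rate` and `cohens_kappa` are all assumptions made for illustration.

```python
from collections import Counter

def agreement_rate(human, judge):
    """Fraction of examples where the judge label matches the human label."""
    if len(human) != len(judge) or not human:
        raise ValueError("label lists must be non-empty and the same length")
    return sum(h == j for h, j in zip(human, judge)) / len(human)

def cohens_kappa(human, judge):
    """Agreement corrected for the agreement expected by chance."""
    n = len(human)
    observed = agreement_rate(human, judge)
    human_counts = Counter(human)
    judge_counts = Counter(judge)
    labels = set(human) | set(judge)
    # Chance agreement: probability both raters pick the same label independently.
    expected = sum(human_counts[l] * judge_counts[l] for l in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)

# Hypothetical golden dataset: one human label and one judge label per response.
human_labels = ["pass", "pass", "fail", "pass", "fail", "fail", "pass", "fail"]
judge_labels = ["pass", "pass", "fail", "fail", "fail", "fail", "pass", "pass"]

rate = agreement_rate(human_labels, judge_labels)
kappa = cohens_kappa(human_labels, judge_labels)

# Indices where the judge disagrees with the human label, for manual review.
disagreements = [i for i, (h, j) in enumerate(zip(human_labels, judge_labels)) if h != j]

print(f"agreement: {rate:.2f}, kappa: {kappa:.2f}, review rows: {disagreements}")
```

Reviewing the disagreeing rows is typically where a judge prompt gets revised; the metrics are then recomputed against the same human labels to see whether alignment improved.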

Taught by

Peter Yang
