
YouTube

How to Improve AI Apps with Automated Evals

Shaw Talebi via YouTube

Overview

Learn how to scale up the evaluation of AI applications through automated evaluation techniques in this comprehensive tutorial. Explore the challenges of evaluating open-ended LLM tasks that typically require human assessment and discover practical solutions using automated evals. Master the typical LLM workflow and understand common problems that arise when building AI applications. Dive deep into two distinct types of automated evaluations and their applications in real-world scenarios. Follow along with a detailed case study featuring an eval-driven LinkedIn Ghostwriter project that demonstrates the complete process from identifying failure modes to creating LLM judges. Gain hands-on experience with curating user inputs, generating content, applying evaluations, and refining results based on feedback. Access example code and references to implement these techniques in your own AI projects, and see a live demonstration of the automated evaluation system in action.
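The eval-driven loop described above (identify failure modes, create an LLM judge, generate content, apply evals, review the results) can be sketched in a few lines. This is a minimal illustration, not code from the video: `call_llm` is an assumed placeholder for any chat-completion client, and the prompt wording and failure-mode names are hypothetical.

```python
# Minimal sketch of an automated LLM-judge eval.
# `call_llm` is a placeholder (an assumption): any function that takes a
# prompt string and returns the model's reply as a string will work.

JUDGE_PROMPT = """You are evaluating a LinkedIn post draft.
Failure mode to check for: {failure_mode}
Post:
{post}
Answer with exactly PASS or FAIL."""


def judge(post: str, failure_mode: str, call_llm) -> bool:
    """Return True if the judge model passes the post for this failure mode."""
    reply = call_llm(JUDGE_PROMPT.format(failure_mode=failure_mode, post=post))
    return reply.strip().upper().startswith("PASS")


def run_evals(posts, failure_modes, call_llm):
    """Apply every judge to every generated post; report pass rate per failure mode."""
    passes = {fm: 0 for fm in failure_modes}
    for post in posts:
        for fm in failure_modes:
            if judge(post, fm, call_llm):
                passes[fm] += 1
    return {fm: count / len(posts) for fm, count in passes.items()}
```

Reviewing the per-failure-mode pass rates then tells you which part of the generation prompt to refine next, closing the loop the video walks through.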

Syllabus

Introduction - 0:00
The Typical LLM Workflow - 0:21
The Problem - 1:11
Automated Evals - 1:50
2 Types of Automated Evals - 4:25
Example: Eval-driven LinkedIn Ghostwriter - 7:03
Step 1: Identify Failure Modes - 9:36
Step 2: Create LLM Judge - 10:49
Step 3: Curate User Inputs - 19:49
Step 4: Generate LI Posts - 20:30
Step 5: Apply Evals - 21:12
Step 6: Review Results and Refine - 22:06
The Results - 25:19
Demo - 26:59

Taught by

Shaw Talebi

