Evals Are Not Unit Tests - Evaluating Non-Deterministic AI Systems

AI Engineer via YouTube

Classroom Contents

  1. 00:00 Introduction to Vercel's v0 and its growth
  2. 01:00 The problem with AI unreliability
  3. 02:44 The "Fruit Letter Counter" app example of AI failure
  4. 03:33 Introducing "evals" and the basketball court analogy
  5. 05:09 Defining the "court": understanding the domain of user queries
  6. 07:53 Data collection for evals
  7. 09:13 Structuring evals: constants in data, variables in task
  8. 10:45 Scoring evals
  9. 12:35 Integrating evals into CI/CD
  10. 13:40 The benefits of using evals
