
Tuning LLM Judge Design Decisions for 1/1000 of the Cost

AutoML Seminars via YouTube

Overview

Learn how to systematically optimize Large Language Model (LLM) judges for evaluating model outputs at dramatically reduced costs through this 42-minute AutoML Seminars presentation. Discover the challenges of expensive human annotations in LLM evaluation and explore how LLM-based judges can rank models without human intervention by comparing outputs between different LLMs. Examine the confounding factors that make fair comparisons difficult across different research papers, including variations in models, prompts, and hyperparameters that are often changed simultaneously. Master a systematic approach to analyzing and tuning LLM judge hyperparameters using multi-objective multi-fidelity optimization techniques that balance accuracy against computational cost while significantly reducing search expenses. Understand how this methodology identifies judges that outperform existing benchmarks in both accuracy and cost-efficiency while utilizing open-weight models for enhanced accessibility and reproducibility. Access the accompanying research paper and implementation code to apply these cost-effective evaluation strategies in your own LLM projects and research.
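The core idea of balancing accuracy against computational cost can be sketched as a Pareto-front selection over candidate judge configurations. The configurations, accuracy values, and costs below are purely illustrative placeholders, not results from the presented paper, and the real method additionally uses multi-fidelity search to cut the cost of evaluating each candidate.

```python
# Illustrative sketch: multi-objective selection of LLM-judge configurations,
# trading off agreement with human annotations (higher is better) against
# evaluation cost (lower is better). All numbers below are made up.

def pareto_front(configs):
    """Return configurations not dominated in (accuracy, cost).

    A config is dominated if another config is at least as good on both
    objectives and strictly better on at least one.
    """
    front = []
    for c in configs:
        dominated = any(
            o["accuracy"] >= c["accuracy"]
            and o["cost"] <= c["cost"]
            and (o["accuracy"] > c["accuracy"] or o["cost"] < c["cost"])
            for o in configs
        )
        if not dominated:
            front.append(c)
    return sorted(front, key=lambda c: c["cost"])

# Hypothetical judge design choices (model size x prompt style):
judges = [
    {"name": "small model, short prompt", "accuracy": 0.71, "cost": 0.02},
    {"name": "small model, CoT prompt",   "accuracy": 0.78, "cost": 0.05},
    {"name": "large model, short prompt", "accuracy": 0.77, "cost": 0.40},
    {"name": "large model, CoT prompt",   "accuracy": 0.84, "cost": 0.90},
]

for judge in pareto_front(judges):
    print(f'{judge["name"]}: accuracy={judge["accuracy"]}, cost={judge["cost"]}')
```

In this toy example the "large model, short prompt" configuration is dominated (the small model with a chain-of-thought prompt is both more accurate and cheaper), so only the remaining three survive on the Pareto front — the same kind of comparison the seminar's tuning procedure automates across many judge hyperparameters.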

Syllabus

Tuning LLM Judge Design Decisions for 1/1000 of the Cost

Taught by

AutoML Seminars

