Overview
Learn to build evaluation frameworks for generative AI applications using the LLM-as-a-Judge methodology in this hands-on workshop. Develop practical skills in constructing automated evaluation systems from scratch with Weave, the LLMOps tool from Weights & Biases, guided by the tool's developers. Explore the challenges and limitations of LLM-as-a-Judge in real-world scenarios, including the key considerations essential for practical deployment. Gain insight into best practices for evaluating generative AI applications and learn how to overcome common obstacles when using large language models as evaluation judges. Through practical exercises, master the implementation details needed to design robust evaluation pipelines suitable for production environments.
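The core LLM-as-a-Judge pattern the workshop builds toward can be sketched in a few lines: a grading prompt, a judge call, and an aggregation loop. All names below are hypothetical, and `call_judge_llm` is a stub; in the workshop, that call would be a real model invocation traced and logged with Weave.

```python
# Minimal sketch of the LLM-as-a-Judge evaluation pattern (assumed names).
# A real pipeline would replace `call_judge_llm` with an actual LLM API call,
# typically wrapped so the tool (e.g. Weave) can trace inputs and outputs.

JUDGE_PROMPT = """You are a strict grader. Score the ANSWER to the QUESTION
from 1 (wrong) to 5 (excellent). Reply with only the integer.

QUESTION: {question}
ANSWER: {answer}"""

def call_judge_llm(prompt: str) -> str:
    # Stubbed judge for illustration only: always returns a fixed score.
    return "4"

def judge_score(question: str, answer: str) -> int:
    # Format the rubric prompt, ask the judge model, and parse its reply.
    raw = call_judge_llm(JUDGE_PROMPT.format(question=question, answer=answer))
    score = int(raw.strip())
    if not 1 <= score <= 5:
        # Judges sometimes go off-rubric; fail loudly rather than log bad data.
        raise ValueError(f"judge returned out-of-range score: {score}")
    return score

def evaluate(dataset: list[dict]) -> float:
    # Score every example and report the mean as the evaluation metric.
    scores = [judge_score(row["question"], row["answer"]) for row in dataset]
    return sum(scores) / len(scores)
```

The consistency and bias issues the workshop discusses show up exactly here: the same rubric prompt can yield different integers across runs, which is why automated frameworks add constrained output parsing and repeated sampling around this loop.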
Syllabus
Fully Connected Tokyo: [Hands-on workshop] From 0 to automated evals
Taught by
Weights & Biases
Reviews
5.0 rating, based on 1 Class Central review
"This hands-on workshop from Weights & Biases is excellent for anyone building LLM apps! It guides you from manual evals to fully automated frameworks using Weave and LLM-as-a-Judge. Practical, code-focused, with real-world tips on challenges like bias and consistency. Highly recommended for AI developers wanting reliable, scalable evaluations—great production insights!"