LLM Evaluation Framework for Crafting Delightful Content from Messy Inputs
MLOps World: Machine Learning in Production via YouTube
AI, Data Science & Cloud Certificates from Google, IBM & Meta
AI Engineer - Learn how to integrate AI into software applications
Overview
AI, Data Science & Cloud Certificates from Google, IBM & Meta — 40% Off
One plan covers every Professional Certificate on Coursera. 40% off Coursera Plus Annual.
Unlock All Certificates
Explore an evaluation framework for assessing the quality of Large Language Model (LLM) outputs in transforming diverse and messy textual inputs into refined content. This 32-minute conference talk by Shin Liang, Senior Machine Learning Engineer at Canva, delves into the challenges of objectively evaluating LLM outcomes in subjective and unstructured tasks. Learn about general evaluation metrics like relevance, fluency, and coherence, as well as specific metrics such as information preservation rate, accuracy of title/heading understanding, and key information extraction scores. Discover how this framework can be applied to similar LLM tasks, providing valuable insights for crafting high-quality content from complex inputs.
Syllabus
LLM Evaluation to Craft Delightful Content From Messy Inputs
Taught by
MLOps World: Machine Learning in Production