Performance Evaluation of Open-Source Instruction-Tuned Large Language Models
Discover AI via YouTube
Overview
This video presentation reviews recent developments in open-source instruction-tuned Large Language Models (LLMs), focusing on performance benchmarks and evaluation methodologies. It covers key findings from a recent arXiv pre-print titled "INSTRUCTEVAL", which proposes a holistic evaluation framework for instruction-tuned LLMs, and compares results across three major leaderboards: Stanford's HELM, Hugging Face, and LMSYS. Topics include evaluation data, problem-solving capabilities, alignment with human values, and practical implications for AI development. The presentation also examines which open-source LLMs currently lead across various metrics and use cases.
Syllabus
Introduction
Evaluation Data
Problem Solving
Main Message
Human Values
Conclusion
Bonus
HELM
Taught by
Discover AI