The Evolution of LLM Evaluation and Japan's Cutting-Edge Benchmarks on the Nejumi Leaderboard

Weights & Biases via YouTube

Overview

Explore how large language model evaluation has evolved through Weights & Biases' Nejumi LLM Leaderboard in this 20-minute conference talk. W&B has run systematic performance evaluations of LLMs since 2023 and published the results continuously; the leaderboard has become Japan's largest evaluation platform and a key reference for researchers and companies. The talk traces the leaderboard's iterative development from the initial version through the latest, version 4, and shows how it has adapted to advances in evaluation techniques and model design. It closes with insights from actual operational experience and future prospects for LLM evaluation methodologies and benchmarking standards.

Syllabus

The evolution of LLM evaluation and Japan’s cutting-edge benchmarks on the Nejumi leaderboard

Taught by

Weights & Biases
