
Inference Scaling: A New Frontier for AI Capability

Simons Institute via YouTube

Overview

This lecture by Azalia Mirhoseini from Stanford/DeepMind explores inference compute as an emerging frontier for scaling Large Language Models (LLMs). Discover how "Large Language Monkeys" research demonstrates a predictable log-linear relationship between coverage (problems solved) and the number of inference samples across four orders of magnitude, suggesting the existence of inference-time scaling laws. Learn how these coverage increases translate to improved performance in domains with automatic verification like coding and formal proofs, while identifying correct samples without verifiers remains challenging. Explore the Archon framework, which automatically designs effective inference-time systems by selecting, combining, and stacking operations like repeated sampling, fusion, ranking, and verification to optimize LLM performance across diverse tasks. The talk concludes with hardware acceleration techniques to improve computational efficiency in LLM serving.
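The coverage metric discussed in the lecture is commonly estimated with the unbiased pass@k formula: given n generated samples of which c are correct, it computes the probability that at least one of k randomly drawn samples solves the problem. A minimal sketch (the numbers below are hypothetical, for illustration only):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased estimate of coverage: the probability that at least one
    of k samples, drawn without replacement from n total samples of
    which c are correct, solves the problem."""
    if n - c < k:
        # Fewer incorrect samples than draws: a correct one is guaranteed.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Hypothetical problem: 5 of 100 generated samples passed the verifier.
# Coverage climbs steadily as the inference-sample budget k grows.
for k in (1, 10, 100):
    print(k, round(pass_at_k(100, 5, k), 3))
```

Plotting such coverage estimates against k on a log scale is what reveals the roughly log-linear trend the talk describes.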

Syllabus

Inference Scaling: A New Frontier for AI Capability

Taught by

Simons Institute

