Overview
This 29-minute video from Discover AI examines how Llama 4 Scout achieves its 10-million-token context length through two techniques: a scaled softmax function proposed by researchers at The University of Tokyo, and an optimized layer configuration, building on work from Cohere AI, that interleaves RoPE and NoPE layers with normalization. The video also asks whether Llama 4 Scout can effectively reason across such a long context window and discusses the practical implications of these design choices. It references Meta's April 2025 publication "The Llama 4 herd: The beginning of a new era of natively multimodal AI innovation."
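The scaled softmax mentioned above likely refers to Scalable-Softmax (SSMax), which multiplies the attention logits by a factor proportional to log n, where n is the number of attended positions; this counteracts the tendency of standard softmax to flatten toward a uniform distribution as the context grows. A minimal sketch (the scaling constant `s` is illustrative here; in practice it is typically a learned per-head parameter):

```python
import numpy as np

def softmax(z):
    # Numerically stable standard softmax.
    e = np.exp(z - z.max())
    return e / e.sum()

def scalable_softmax(z, s=0.43):
    # SSMax sketch: scale logits by s * log(n) before the softmax,
    # where n is the number of attended positions. For large n this
    # sharpens the distribution relative to plain softmax.
    n = len(z)
    return softmax(s * np.log(n) * z)

# With 10,000 positions and one clearly relevant token, plain softmax
# spreads probability mass thinly; SSMax keeps attention focused.
z = np.zeros(10_000)
z[0] = 5.0
p_std = softmax(z)
p_ss = scalable_softmax(z)
```

Here `p_ss[0]` stays close to 1 while `p_std[0]` collapses toward 1/n, which is the failure mode long-context attention needs to avoid.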
Syllabus
Llama 4 Scout: 10M Token Context Length EXPLAINED
Taught by
Discover AI