

Llama-Nemotron - Efficient Open Reasoning Models

MLOps World: Machine Learning in Production via YouTube

Overview

Learn about Llama-Nemotron, an open-source family of reasoning models that delivers state-of-the-art reasoning capabilities with industry-leading inference efficiency, in this 30-minute conference talk. Discover how these models, available in three sizes (Nano 8B, Super 49B, and Ultra 253B), surpass existing open reasoning models such as DeepSeek-R1 while offering substantial improvements in inference throughput and memory efficiency.

Explore the training methodology behind these models, including a two-stage post-training pipeline: supervised fine-tuning (SFT) on carefully curated synthetic datasets to distill advanced reasoning behaviors, followed by large-scale reinforcement learning (RL) with curriculum-driven self-learning that enables the models to exceed teacher performance. Examine key innovations such as neural architecture search (NAS) for model efficiency, targeted inference-time optimizations, and a dynamic toggle for switching reasoning on or off, with emphasis on their practical importance in real-world enterprise deployments.

Gain insights from NVIDIA Research Scientist Soumye Singhal, who specializes in LLM post-training and alignment for Nemotron models and has contributed to the development of both the Llama-Nemotron reasoning models and the Nemotron-Hybrid models.
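The reasoning toggle mentioned above is exposed through the system prompt: NVIDIA's published Llama-Nemotron model cards use the strings "detailed thinking on" and "detailed thinking off" to switch modes. A minimal illustrative sketch, assuming the standard chat-message format (the helper name `build_messages` is hypothetical, not from the talk):

```python
# Sketch of the system-prompt reasoning toggle described in the talk.
# The "detailed thinking on"/"detailed thinking off" strings follow NVIDIA's
# published Llama-Nemotron model cards; the helper name is illustrative.

def build_messages(user_prompt: str, reasoning: bool) -> list[dict]:
    """Build a chat-format message list with the reasoning toggle as the system prompt."""
    system = "detailed thinking on" if reasoning else "detailed thinking off"
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user_prompt},
    ]

# With reasoning enabled, the model emits its chain of thought before answering;
# with it disabled, it answers directly at lower latency and token cost.
messages = build_messages("What is 17 * 23?", reasoning=True)
```

Serving one model that handles both modes, rather than separate "reasoning" and "chat" checkpoints, is part of the deployment-efficiency argument the talk emphasizes.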

Syllabus

Llama-Nemotron: Efficient Open Reasoning Models

Taught by

MLOps World: Machine Learning in Production

