

Dynamo - Supporting Next-Generation AI Workloads

Linux Foundation via YouTube

Overview

Explore NVIDIA's Dynamo, a high-throughput, low-latency inference framework designed for serving generative AI and reasoning models in multi-node distributed environments. Gain a comprehensive technical overview of Dynamo's architecture and understand how its design addresses the core challenges of large-scale, distributed generative AI inference as enterprise needs shift toward complex deployments. Walk through concrete deployment scenarios, including disaggregated serving and dynamic GPU scheduling, and examine how Dynamo manages resource allocation, request routing, and memory efficiency for optimal performance. Learn from practical implementation examples and discover engineering best practices for optimizing workload performance, scalability, and cost with Dynamo. Understand the steps and considerations for deploying Dynamo in production environments, including key architectural differences and compatibility factors. Master the deployment and operation of Dynamo to support advanced AI workloads in enterprise-scale distributed systems, with insights from NVIDIA experts on meeting the evolving demands of generative AI inference serving.
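To make the disaggregated-serving idea mentioned above concrete, the sketch below separates the two phases of LLM inference: prefill (processing the prompt and building the KV cache) and decode (generating tokens), with each phase served by its own worker pool and a least-loaded router choosing within each pool. This is an illustrative toy only; every class and function name here is hypothetical and is not the Dynamo API.

```python
# Toy illustration of disaggregated serving: prefill and decode run on
# separate worker pools that can be sized and scheduled independently.
# All names are hypothetical; this is NOT the NVIDIA Dynamo API.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Worker:
    name: str
    role: str              # "prefill" or "decode"
    active_requests: int = 0

@dataclass
class Router:
    workers: List[Worker] = field(default_factory=list)

    def pick(self, role: str) -> Worker:
        # Least-loaded routing within the pool for the requested phase.
        pool = [w for w in self.workers if w.role == role]
        return min(pool, key=lambda w: w.active_requests)

    def handle(self, prompt: str) -> str:
        # Phase 1: a prefill worker ingests the prompt and builds the KV cache.
        p = self.pick("prefill")
        p.active_requests += 1
        # Phase 2: the KV cache is handed off to a decode worker, which
        # generates tokens; the two pools scale independently, which is
        # the key benefit of disaggregating the phases.
        d = self.pick("decode")
        d.active_requests += 1
        return f"prefill={p.name} decode={d.name}"
```

A real system would also transfer the KV cache between nodes and account for GPU memory when routing; the point here is only that the prefill and decode pools are scheduled separately.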

Syllabus

Dynamo: Supporting Next-Generation AI Workloads - Olga Andreeva & Ryan McCormick, NVIDIA

Taught by

Linux Foundation

