Overview
Explore NVIDIA's Dynamo, a high-throughput, low-latency inference framework designed for serving generative AI and reasoning models in multi-node distributed environments. Gain a comprehensive technical overview of Dynamo's architecture and understand how its design addresses the core challenges of large-scale, distributed generative AI inference as enterprise needs shift toward complex deployments. Walk through concrete deployment scenarios, including disaggregated serving and dynamic GPU scheduling, and examine how Dynamo manages resource allocation, request routing, and memory efficiency. See practical implementation examples and engineering best practices for optimizing workload performance, scalability, and cost using Dynamo. Understand the steps and considerations for deploying Dynamo in production environments, including key architectural differences and compatibility factors. Learn to deploy and operate Dynamo in support of advanced AI workloads in enterprise-scale distributed systems, with insights from NVIDIA experts on meeting the evolving demands of generative AI inference serving.
Syllabus
Dynamo: Supporting Next-Generation AI Workloads - Olga Andreeva & Ryan McCormick, NVIDIA
Taught by
Linux Foundation