
MPI Meets Machine Learning - Unlocking PyTorch Distributed for Scaling AI Workloads

DevConf via YouTube

Overview

Explore the intersection of high-performance computing and modern deep learning in this conference talk, which bridges traditional HPC paradigms with PyTorch's distributed computing ecosystem. Discover how familiar HPC concepts such as collective operations, point-to-point communication, and process groups manifest in PyTorch's distributed APIs, and learn how PyTorch builds on battle-tested communication backends, including NCCL, Gloo, and MPI, while introducing new primitives optimized for gradient synchronization and model parallelism. Move beyond basic data parallelism to examine memory-saving techniques such as Fully Sharded Data Parallel (FSDP), PyTorch's native solution for memory scaling, and explore the emerging Tensor and Pipeline Parallelism APIs that show how these techniques compose to train massive models. Gain a comprehensive understanding of PyTorch's distributed architecture, see how familiar patterns from parallel computing map onto its implementation, and identify areas for innovation in one of the most actively developed corners of modern ML infrastructure.
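To make the HPC-to-PyTorch mapping concrete, the sketch below simulates the collective at the heart of data-parallel training, all-reduce, in plain Python. It deliberately avoids torch and MPI so it runs anywhere; the rank count and gradient values are made up for illustration, and a real backend (NCCL, Gloo, or MPI via `torch.distributed.all_reduce`) would perform this across processes and devices.

```python
# Conceptual sketch of an MPI-style all-reduce used for gradient
# synchronization. Each "rank" holds a local gradient vector; after the
# collective, every rank holds the element-wise sum of all vectors.

def all_reduce_sum(per_rank_grads):
    """Sum corresponding gradient elements across all ranks, then give
    every rank a copy of the reduced result (sum-reduce + broadcast)."""
    reduced = [sum(vals) for vals in zip(*per_rank_grads)]
    return [list(reduced) for _ in per_rank_grads]

# Four simulated ranks, each with a local gradient from its data shard.
grads = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
synced = all_reduce_sum(grads)
# Every rank now holds [16.0, 20.0]; dividing by the world size (4)
# yields the averaged gradient that data-parallel training applies.
```

In production, frameworks implement this with bandwidth-optimal algorithms such as ring all-reduce rather than a central sum, but the semantics, identical reduced values on every rank, are exactly what this sketch shows.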

Syllabus

MPI Meets Machine Learning: Unlocking PyTorch distributed for scaling AI workloads - DevConf.IN 2026

Taught by

DevConf

