Overview
Explore the intersection of high-performance computing and modern deep learning in this conference talk, which bridges traditional HPC paradigms with PyTorch's distributed computing ecosystem. Discover how familiar HPC concepts such as collective operations, point-to-point communication, and process groups manifest in PyTorch's distributed APIs, and learn how PyTorch builds on battle-tested communication backends, including NCCL, Gloo, and MPI, while introducing new primitives optimized for gradient synchronization and model parallelism. Move beyond basic data parallelism to examine advanced memory-saving techniques such as Fully Sharded Data Parallel (FSDP), PyTorch's native solution for memory scaling, and explore the emerging Tensor and Pipeline Parallelism APIs, which demonstrate how these techniques compose to train massive models. By mapping distributed-systems concepts onto PyTorch's implementation, the talk offers a comprehensive view of PyTorch's distributed architecture, one of the most actively developed areas in modern ML infrastructure, showing how familiar patterns from parallel computing reappear in PyTorch's ecosystem and where there is room for innovation and improvement.
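To make the HPC-to-PyTorch mapping concrete, the sketch below simulates the ring all-reduce algorithm in plain Python: the reduce-scatter plus all-gather communication pattern that backends such as NCCL use to implement the `all_reduce` collective for gradient synchronization. This is a conceptual illustration under simplifying assumptions (ranks simulated sequentially in one process, vector length divisible by world size), not PyTorch's or NCCL's actual implementation.

```python
def ring_all_reduce(buffers):
    """Simulate a ring all-reduce (sum) over one vector per rank.

    `buffers` holds one equal-length list of numbers per simulated rank;
    the vector length must divide evenly into one chunk per rank. Returns
    the per-rank buffers after reduce-scatter + all-gather, at which point
    every rank holds the element-wise sum.
    """
    n = len(buffers)                   # world size (number of ranks)
    length = len(buffers[0])
    assert length % n == 0, "vector must split into one chunk per rank"
    chunk = length // n
    data = [list(b) for b in buffers]  # each rank's local working copy

    def seg(c):                        # indices covered by chunk c
        return range(c * chunk, (c + 1) * chunk)

    # Phase 1 (reduce-scatter): in step s, rank r passes chunk (r - s) mod n
    # to its right neighbour, which accumulates it. After n - 1 steps,
    # rank r holds the fully reduced chunk (r + 1) mod n.
    for s in range(n - 1):
        for r in range(n):
            c, dst = (r - s) % n, (r + 1) % n
            for i in seg(c):
                data[dst][i] += data[r][i]

    # Phase 2 (all-gather): each reduced chunk circulates around the ring,
    # overwriting stale copies, until every rank holds the full result.
    for s in range(n - 1):
        for r in range(n):
            c, dst = (r + 1 - s) % n, (r + 1) % n
            for i in seg(c):
                data[dst][i] = data[r][i]
    return data


# Example: 4 ranks, each contributing a gradient vector of length 4.
grads = [[float(r + 1)] * 4 for r in range(4)]  # ranks hold 1s, 2s, 3s, 4s
result = ring_all_reduce(grads)
# Every rank ends up with the element-wise sum 1 + 2 + 3 + 4 = 10.
assert all(buf == [10.0, 10.0, 10.0, 10.0] for buf in result)
```

Each of the 2(n-1) steps moves only 1/n of the vector per rank, which is why this pattern keeps per-link bandwidth nearly constant as the number of ranks grows; in PyTorch the same operation is invoked as `torch.distributed.all_reduce` against an initialized process group.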
Syllabus
MPI Meets Machine Learning: Unlocking PyTorch distributed for scaling AI workloads - DevConf.IN 2026
Taught by
DevConf