Distributed Training: Hybrid Parallelism and Gradient Optimization - Lecture 20
MIT HAN Lab via YouTube
Overview
Learn advanced distributed training concepts in this MIT lecture covering hybrid parallelism, auto-parallelization techniques, and strategies for overcoming bandwidth and latency bottlenecks in machine learning systems. Explore gradient compression methods, including gradient pruning (sparse communication and deep gradient compression) and gradient quantization (1-Bit SGD and TernGrad). Understand how delayed gradient updates can address latency challenges in distributed training environments. Delivered by Professor Song Han as part of the MIT 6.5940 course, this 59-minute lecture covers the communication-reduction techniques needed to implement efficient distributed machine learning systems.
Syllabus
EfficientML.ai Lecture 20 - Distributed Training Part 2 (MIT 6.5940, Fall 2024)
Taught by
MIT HAN Lab