Next-Generation Networks for Machine Learning
Scalable Parallel Computing Lab, SPCL @ ETH Zurich via YouTube
Overview
Explore cutting-edge techniques for accelerating distributed deep neural network (DNN) training in this 50-minute conference talk by Manya Ghobadi at SPCL_Bcast. Delve into the challenges posed by increasing dataset and model sizes, and discover innovative solutions to overcome network bottlenecks in datacenter environments. Learn about a novel optical fabric that optimizes network topology and parallelization strategies for DNN clusters. Examine the limitations of fair-sharing in congestion control algorithms and understand a new scheduling approach that strategically places jobs on network links to enhance performance. Gain insights into the future of machine learning infrastructure and network design for improved training efficiency.
Syllabus
Introduction
Talk
Announcements
Taught by
Scalable Parallel Computing Lab, SPCL @ ETH Zurich