Overview
Learn how to accelerate distributed graph neural network (GNN) training through in-network processing techniques in this 16-minute conference presentation from USENIX ATC '25. Discover the challenges of distributed GNN training systems, including memory limitations, redundant traffic, and bandwidth bottlenecks that arise when partitioning large graphs across multiple workers, and explore how traditional approaches suffer from complex dependencies among graph data and limited switch-aggregator resources that lead to performance degradation.

Understand the proposed SwitchGNN solution, which addresses these issues through coordinated in-network multicast and aggregation. It features a graph-aware multicast reordering algorithm that prioritizes vertices with higher neighbor counts to reduce communication time, and a multi-level graph partitioning mechanism that prevents aggregator overflow by splitting boundary vertices into independent blocks for batch processing while preserving graph propagation correctness.

Review the implementation details, built on P4 programmable switches and a DPDK host stack, along with experimental results from a real testbed and NS-3 simulations demonstrating up to a 74% reduction in training time through reduced communication overhead.
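The two mechanisms described above can be illustrated with a minimal sketch. This is not SwitchGNN's actual implementation (which runs on P4 switches); it only conveys the two ideas in plain Python, and the function names, the degree-based sort key, and the treatment of "independent blocks" as simple capacity-bounded batches are all illustrative assumptions:

```python
# Hypothetical sketch of the two ideas from the talk, not the paper's code.

def reorder_for_multicast(boundary_vertices, neighbor_counts):
    """Graph-aware multicast reordering (sketch): schedule vertices with
    more neighbors first, so their aggregated results are produced and
    multicast earlier, overlapping with later transfers."""
    return sorted(boundary_vertices,
                  key=lambda v: neighbor_counts.get(v, 0),
                  reverse=True)

def partition_boundary(vertices, aggregator_capacity):
    """Multi-level partitioning (sketch): split boundary vertices into
    blocks that each fit the switch aggregator's limited capacity, so
    blocks can be processed batch by batch without overflowing it.
    Here 'independence' between blocks is simply assumed."""
    return [vertices[i:i + aggregator_capacity]
            for i in range(0, len(vertices), aggregator_capacity)]

# Example: vertex 1 has the most neighbors, so it is multicast first,
# and ten boundary vertices are batched into capacity-4 blocks.
order = reorder_for_multicast([3, 7, 1], {3: 2, 7: 5, 1: 9})
blocks = partition_boundary(list(range(10)), aggregator_capacity=4)
```

In the real system these decisions are coordinated with the switch's in-network aggregation, which is what a host-side sketch like this cannot show.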
Syllabus
USENIX ATC '25 - Accelerating Distributed Graph Learning by Using Collaborative In-Network...
Taught by
USENIX