Overview
Learn how to accelerate distributed graph neural network (GNN) training through innovative in-network processing techniques in this 16-minute conference presentation from USENIX ATC '25.

Discover the challenges facing distributed GNN training systems, including memory limitations, redundant traffic, and bandwidth bottlenecks that arise when partitioning large graphs across multiple workers. Explore how traditional approaches suffer from complex dependencies among graph data and limited switch-aggregator resources, leading to performance degradation.

Understand the proposed SwitchGNN solution, which addresses these issues through coordinated in-network multicast and aggregation. It features a graph-aware multicast reordering algorithm that prioritizes vertices with higher neighbor counts to reduce communication time, and a multi-level graph partitioning mechanism that prevents aggregator overflow by partitioning boundary vertices into independent blocks for batch processing while maintaining graph propagation correctness.

Review the implementation details using P4 programmable switches and a DPDK host stack, along with experimental results from a real testbed and NS3 simulations demonstrating up to a 74% reduction in training time through effective reduction of communication overhead.
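The two scheduling ideas summarized above can be sketched in a few lines. This is a hypothetical illustration of the concepts as described, not the actual SwitchGNN implementation: degree-prioritized reordering sends high-neighbor-count vertices first, and capacity-limited partitioning groups boundary vertices into blocks small enough for the switch aggregator.

```python
# Hypothetical sketch of the two ideas described in the talk, not the
# real SwitchGNN code: (1) reorder boundary vertices so those with more
# neighbors are multicast first, and (2) partition boundary vertices
# into blocks that fit an assumed switch-aggregator capacity.

def reorder_by_degree(boundary_vertices, adjacency):
    """Sort boundary vertices so those with more neighbors come first."""
    return sorted(boundary_vertices,
                  key=lambda v: len(adjacency[v]),
                  reverse=True)

def partition_into_blocks(vertices, aggregator_capacity):
    """Split vertices into independent blocks for batch processing,
    so no batch exceeds the aggregator's slot count."""
    return [vertices[i:i + aggregator_capacity]
            for i in range(0, len(vertices), aggregator_capacity)]

# Toy example: vertex -> neighbor list
adjacency = {0: [1, 2, 3], 1: [0], 2: [0, 3], 3: [0, 2]}
order = reorder_by_degree([0, 1, 2, 3], adjacency)
blocks = partition_into_blocks(order, aggregator_capacity=2)
print(order)   # [0, 2, 3, 1] -- highest-degree vertex first
print(blocks)  # [[0, 2], [3, 1]] -- batches bounded by capacity
```

The real system performs this scheduling inside P4 programmable switches; the sketch only shows the ordering and batching logic on the host side.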
Syllabus
USENIX ATC '25 - Accelerating Distributed Graph Learning by Using Collaborative In-Network...
Taught by
USENIX