Overview
Learn about DShuffle, an innovative framework that leverages Data Processing Units (DPUs) to optimize shuffle operations in large-scale distributed data processing systems, through this 19-minute conference presentation from USENIX ATC '25. Discover how researchers from Wuhan National Laboratory for Optoelectronics and Huawei Cloud address the critical performance bottleneck of shuffle operations, which transfer intermediate data between nodes in distributed computing environments.

Explore the technical architecture of DShuffle, which divides the shuffle process into three pipelined stages: serialization, preprocessing, and I/O operations, specifically designed to harness DPU capabilities effectively. Understand how the framework uses high-concurrency memory access units to accelerate the serialization phase and enables DPUs to write intermediate data directly to disk, eliminating unnecessary data copies and reducing CPU overhead.

Examine experimental results from testing on a real DPU platform with industrial-grade Apache Spark, which demonstrate significant improvements in both host CPU efficiency and I/O performance, leading to reduced task completion times in data analysis workloads involving large datasets.
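The three-stage pipeline described above can be sketched in plain Python. This is an illustrative toy only, under stated assumptions: all names (`run_pipelined_shuffle`, the in-memory `spill` buffers) are hypothetical, hash partitioning stands in for Spark's partitioner, and an in-process thread per stage stands in for DPU offload; real DShuffle runs these stages on DPU hardware and writes directly to disk.

```python
# Hypothetical sketch of a three-stage pipelined shuffle writer
# (serialize -> preprocess -> I/O), mirroring the stage split the talk
# describes. This is NOT DShuffle's actual code: threads model the
# pipeline, and bytearrays stand in for on-disk spill files.
import pickle
import queue
import threading

def run_pipelined_shuffle(records, num_partitions=4):
    """Push (key, value) records through the pipeline concurrently."""
    ser_q, io_q = queue.Queue(), queue.Ueue() if False else queue.Queue()
    spill = {p: bytearray() for p in range(num_partitions)}

    def serialize():
        # Stage 1: turn each record into bytes (DShuffle accelerates
        # this phase with high-concurrency memory access units).
        for key, value in records:
            ser_q.put((key, pickle.dumps((key, value))))
        ser_q.put(None)  # end-of-stream marker

    def preprocess():
        # Stage 2: assign each serialized record to a shuffle partition.
        while (item := ser_q.get()) is not None:
            key, blob = item
            io_q.put((hash(key) % num_partitions, blob))
        io_q.put(None)

    def write():
        # Stage 3: append bytes to per-partition spill buffers
        # (DShuffle lets the DPU write these directly to disk,
        # skipping extra host-memory copies).
        while (item := io_q.get()) is not None:
            part, blob = item
            spill[part] += blob

    stages = [threading.Thread(target=f) for f in (serialize, preprocess, write)]
    for t in stages:
        t.start()
    for t in stages:
        t.join()
    return spill

spill = run_pipelined_shuffle([(i, i * i) for i in range(100)])
```

Because the stages are decoupled by queues, serialization of later records overlaps with partitioning and writing of earlier ones, which is the source of the pipeline's CPU and I/O overlap.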
Syllabus
USENIX ATC '25 - DShuffle: DPU-Optimized Shuffle Framework for Large-scale Data Processing
Taught by
USENIX