LLAMP - Assessing Network Latency Tolerance of HPC Applications with Linear Programming
Scalable Parallel Computing Lab, SPCL @ ETH Zurich via YouTube
Overview
Learn about LLAMP, a novel analytical toolchain that uses linear programming to assess the network latency tolerance of HPC applications without requiring specialized hardware or time-consuming network simulators. Discover how this approach leverages the LogGPS model to give software developers and network architects crucial insights for optimizing HPC infrastructures and strategically deploying applications to minimize latency impacts. Explore the methodology for evaluating how well communication-intensive MPI applications tolerate network latency degradation, with validation results showing high accuracy and relative prediction errors generally below 2% across applications such as MILC, LULESH, and LAMMPS. Examine a comprehensive case study of the ICON weather and climate model that demonstrates LLAMP's broader applicability to evaluating collective algorithms and network topologies. Understand why assessing network latency is critical as the push toward high-bandwidth networks, driven by AI workloads in data centers and HPC clusters, has inadvertently worsened latency, and learn how this toolchain accounts for the significant differences in latency tolerance exhibited by large-scale MPI applications.
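To make the idea concrete, here is a minimal sketch (not LLAMP itself) of how linear programming can estimate an application's latency sensitivity. It models a hypothetical four-task MPI dependency graph in which each communication edge costs o + L (a LogGPS-style overhead o plus network latency L), solves for the makespan as an LP, and approximates dT/dL by finite differences. The task graph, costs, and `finish_time` helper are all illustrative assumptions, not the paper's actual formulation.

```python
# Toy sketch: estimate latency sensitivity of a small task graph via an LP.
# Assumptions (hypothetical): a 4-node DAG; communication edges cost o + L,
# compute edges have fixed weights. This is NOT the LLAMP toolchain.
from scipy.optimize import linprog

def finish_time(L, o=1.0):
    """Minimal makespan of the toy DAG under network latency L."""
    # Edges (u, v, cost) encode precedence: t_v >= t_u + cost.
    edges = [(0, 1, 5.0),     # compute on rank A
             (0, 2, o + L),   # message A -> B
             (2, 3, 3.0),     # compute on rank B
             (1, 3, o + L)]   # message A -> B (second phase)
    n = 4
    c = [0.0, 0.0, 0.0, 1.0]  # minimize t_3, the finish time
    A_ub, b_ub = [], []
    for u, v, cost in edges:
        row = [0.0] * n
        row[u], row[v] = 1.0, -1.0   # t_u - t_v <= -cost
        A_ub.append(row)
        b_ub.append(-cost)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * n)
    return res.fun

# Finite-difference latency sensitivity dT/dL: here one communication
# edge sits on the critical path, so runtime grows linearly with L.
sens = (finish_time(2.0) - finish_time(1.0)) / 1.0
print(sens)
```

A sensitivity near 1 means one latency term lies on the critical path; a latency-tolerant application (e.g. one that overlaps communication with computation) would show a sensitivity near 0 even as L grows.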
Syllabus
00:00 Introduction
05:33 LLAMP Toolchain
10:20 Network Latency Sensitivity
14:47 Linear Programming
21:15 Evaluation
24:24 Conclusion
25:23 Q&A
Taught by
Scalable Parallel Computing Lab, SPCL @ ETH Zurich