Overview
Explore a 15-minute conference presentation from OSDI '25 that introduces PipeThreader, a deep neural network (DNN) compiler designed to optimize execution on modern GPUs with heterogeneous specialized hardware units such as Tensor Cores and Tensor Memory Accelerators. Learn how PipeThreader shifts scheduling from hardware to software, enabling more efficient and sophisticated computation pipelining with minimal manual effort. It does so through three components: the sTask-graph abstraction, a hierarchical hardware abstraction that captures the capabilities of specialized units, and new scheduling primitives. Discover how this approach achieves comparable or superior performance on well-studied DNN architectures like FlashAttention, and how it uncovers novel pipeline schemes for emerging models like Mamba2 that significantly outperform state-of-the-art hand-crafted implementations. Gain insights from researchers at Peking University, Microsoft Research, Imperial College London, and Shanghai Jiao Tong University as they present their open-source solution for DNN compilation and GPU utilization.
Syllabus
OSDI '25 - PipeThreader: Software-Defined Pipelining for Efficient DNN Execution
Taught by
USENIX