Ladder: Enabling Efficient Low-Precision Deep Learning Computing through Hardware-aware Tensor Transformation

Explore Ladder, a deep learning compiler, in this 16-minute conference talk from OSDI '24. Learn how Ladder bridges the gap between evolving custom data types and the fixed precision formats supported by current hardware. Discover tType, a general type system, and the extended tensor expression that together let Ladder transform deep neural network (DNN) computations into optimized computing pipelines. Understand how Ladder uses new tensor scheduling primitives and a hardware-aware optimization policy to navigate the complex transformation space, targeting optimal performance across different memory layers and DNN operators. See how Ladder systematically supports a wide array of low-bit-precision custom data types and significantly improves DNN computation performance on modern accelerators without hardware modifications. Finally, consider how this approach lets model designers explore data type optimizations and gives hardware vendors a flexible way to expand support for diverse precision formats.
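
To make the gap between custom data types and hardware-supported formats concrete, here is a minimal hand-written sketch in plain Python. It is not Ladder's API or its generated code: the function names (pack_int4, unpack_int4, dot_int4) are hypothetical, and the choice of a signed 4-bit type stored two per byte is an assumption made only for illustration. It shows one way a low-bit type can be stored in a format the hardware understands (bytes) and converted to a wider integer before computing.

```python
def pack_int4(values):
    """Pack signed 4-bit integers (-8..7) two per byte, low nibble first."""
    if len(values) % 2:
        values = list(values) + [0]  # pad to an even count
    out = bytearray()
    for lo, hi in zip(values[0::2], values[1::2]):
        for v in (lo, hi):
            if not -8 <= v <= 7:
                raise ValueError(f"{v} does not fit in a signed 4-bit integer")
        out.append((lo & 0xF) | ((hi & 0xF) << 4))
    return bytes(out)


def unpack_int4(packed, count):
    """Decode `count` signed 4-bit integers from packed bytes."""
    values = []
    for byte in packed:
        for nibble in (byte & 0xF, byte >> 4):
            # Sign-extend: nibbles 8..15 represent -8..-1.
            values.append(nibble - 16 if nibble >= 8 else nibble)
    return values[:count]


def dot_int4(packed_weights, activations):
    """Dot product with weights stored as int4 but computed in wider integers."""
    weights = unpack_int4(packed_weights, len(activations))
    return sum(w * a for w, a in zip(weights, activations))


weights = [-8, 7, 3, -1, 0]
packed = pack_int4(weights)          # 5 values occupy 3 bytes
assert unpack_int4(packed, len(weights)) == weights
print(dot_int4(packed, [1, 2, 3, 4, 5]))  # prints 11
```

In this sketch the storage type (4-bit) and the compute type (Python integers) are chosen by hand. As the description above suggests, the point of a compiler like Ladder is to make such storage and conversion decisions systematically rather than per data type.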