KPerfIR - Towards an Open and Compiler-centric Ecosystem for GPU Kernel Performance Tooling on Modern AI Workloads

USENIX via YouTube

Overview

Watch this 14-minute conference talk from OSDI '25 presenting KPerfIR, a multi-level, compiler-centric infrastructure for building customizable performance analysis tools for modern AI workloads on GPUs. Learn how KPerfIR integrates profiling into the compiler workflow so that profiling functionality is implemented as compiler passes, yielding a programmable and reusable framework for performance analysis. See how this design bridges the gap between compilers and profilers and gives fine-grained insight into hard optimization problems, such as overlapping the execution of fine-grained function units on GPUs. Explore the integration with the Triton infrastructure, which demonstrates the compiler-centric approach to performance analysis and optimization in AI compiler development. Review the evaluation results, which report low overhead (8.2%) and high accuracy (2% relative error), along with actionable insights into complex intra-kernel GPU events.

Syllabus

OSDI '25 - KPerfIR: Towards an Open and Compiler-centric Ecosystem for GPU Kernel Performance...

Taught by

USENIX
