Overview
Explore how to use eBPF for automatic GPU performance monitoring in this 21-minute conference talk from SREcon25 EMEA. Learn to capture the CUDA calls an application makes to the GPU, including kernel launches and memory allocations, without intrusive instrumentation or significant overhead on the running application. Discover how to export Prometheus metrics from eBPF probes to analyze kernel launch patterns and the associated memory usage. The key advantage of this approach is that instrumentation can be enabled or disabled while GPU applications are running, which makes it particularly useful for monitoring and profiling AI/ML training: you can start monitoring after training has already begun, and toggle it on demand in production environments.
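To make the metrics-export step concrete, here is a minimal sketch of how events captured by such probes might be aggregated and rendered in the Prometheus text exposition format. It does not reproduce the talk's implementation, which this listing does not describe. The `CudaEvent` record, the `render_metrics` function, and the metric names are hypothetical, and the sketch assumes that a probe attached to the CUDA runtime functions `cudaLaunchKernel` and `cudaMalloc` already delivers one record per call to user space. Label-value escaping is omitted for brevity.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class CudaEvent:
    """Hypothetical record emitted by an eBPF probe for one CUDA call."""
    pid: int
    call: str          # e.g. "cudaLaunchKernel" or "cudaMalloc"
    kernel: str = ""   # kernel name for launches, empty otherwise
    nbytes: int = 0    # requested size for allocations, 0 otherwise


def render_metrics(events):
    """Aggregate events and render them in the Prometheus text format."""
    launches = Counter()     # (pid, kernel) -> number of launches
    alloc_bytes = Counter()  # pid -> total bytes requested
    for ev in events:
        if ev.call == "cudaLaunchKernel":
            launches[(ev.pid, ev.kernel)] += 1
        elif ev.call == "cudaMalloc":
            alloc_bytes[ev.pid] += ev.nbytes

    lines = [
        "# HELP cuda_kernel_launches_total Kernel launches per process and kernel.",
        "# TYPE cuda_kernel_launches_total counter",
    ]
    for (pid, kernel), count in sorted(launches.items()):
        lines.append(
            f'cuda_kernel_launches_total{{pid="{pid}",kernel="{kernel}"}} {count}'
        )
    lines += [
        "# HELP cuda_malloc_bytes_total Bytes requested via cudaMalloc per process.",
        "# TYPE cuda_malloc_bytes_total counter",
    ]
    for pid, total in sorted(alloc_bytes.items()):
        lines.append(f'cuda_malloc_bytes_total{{pid="{pid}"}} {total}')
    return "\n".join(lines) + "\n"
```

Calling `render_metrics` on a list of `CudaEvent` records returns a string that an HTTP handler could serve on a `/metrics` endpoint for Prometheus to scrape. Because the aggregation lives in user space, starting or stopping the probes only changes whether new events arrive; the application being monitored is not modified.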
Syllabus
SREcon25 Europe/Middle East/Africa - Auto-Instrumentation for GPU Performance using eBPF
Taught by
USENIX