Auto-instrumentation for GPU Performance Using eBPF

CNCF [Cloud Native Computing Foundation] via YouTube

Overview

Learn how to automatically instrument GPU performance monitoring with eBPF in this conference talk from KubeCon + CloudNativeCon. Discover the challenges of gathering telemetry from modern AI workloads that run on expensive GPU fleets, where manual instrumentation adds performance overhead and lacks a standardized output format for monitoring tools such as Prometheus. Explore how eBPF can capture the CUDA calls made to GPUs, including kernel launches and memory allocations, without intrusive code changes. Understand how eBPF probes export Prometheus metrics for detailed analysis of kernel launch patterns and memory usage. Examine the benefits of this approach, including minimal performance overhead and open-source implementations available on GitHub, which make GPU performance optimization more accessible in cloud native environments.
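
To make the last step of that pipeline concrete, here is a minimal Python sketch of how captured CUDA call events could be aggregated into Prometheus text-format metrics. It is not the implementation presented in the talk. The event tuples stand in for what an eBPF probe might report, and the metric names `gpu_kernel_launches_total` and `gpu_memory_allocated_bytes_total`, the kernel names, and the `render_metrics` helper are all hypothetical. Only `cudaLaunchKernel` and `cudaMalloc` are real CUDA runtime calls.

```python
from collections import Counter

# Hypothetical sample events, shaped as (CUDA call, kernel name or None, bytes).
# In a real setup these would come from eBPF probes, not a hard-coded list.
events = [
    ("cudaLaunchKernel", "matmul", 0),
    ("cudaMalloc", None, 4096),
    ("cudaLaunchKernel", "matmul", 0),
    ("cudaLaunchKernel", "softmax", 0),
    ("cudaMalloc", None, 1024),
]


def render_metrics(events):
    """Aggregate events and return them in the Prometheus text exposition format."""
    launches = Counter()  # kernel name -> number of launches
    alloc_bytes = 0       # total bytes requested via cudaMalloc

    for call, kernel, size in events:
        if call == "cudaLaunchKernel":
            launches[kernel] += 1
        elif call == "cudaMalloc":
            alloc_bytes += size

    # Metric names below are assumptions chosen for illustration.
    lines = ["# TYPE gpu_kernel_launches_total counter"]
    for kernel in sorted(launches):
        lines.append(
            f'gpu_kernel_launches_total{{kernel="{kernel}"}} {launches[kernel]}'
        )
    lines.append("# TYPE gpu_memory_allocated_bytes_total counter")
    lines.append(f"gpu_memory_allocated_bytes_total {alloc_bytes}")
    return "\n".join(lines) + "\n"


print(render_metrics(events))
```

Labeling the launch counter by kernel name is one way to expose the launch patterns mentioned above, since Prometheus can then compute per-kernel rates with ordinary queries.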

Syllabus

Auto-instrumentation for GPU Performance Using eBPF - Annanay Agarwal, Grafana Labs

Taught by

CNCF [Cloud Native Computing Foundation]
