Overview
Explore the problem of GPU underutilization in modern AI and cloud environments in this 43-minute conference talk from DevConf.IN 2026. Learn why expensive GPUs often sit idle in data centers, driving up costs and slowing teams down, even in well-managed environments. Discover the root causes of low utilization: uneven workloads, inefficient scheduling, limited visibility into GPU usage patterns, and hardware-software mismatches. Gain practical knowledge of solutions that can be applied in real production systems, including GPU sharing, workload right-sizing, improved scheduling, and effective monitoring. Examine lessons from building a GPU-as-a-Service (GPUaaS) platform, including model checkpointing, job preemption and resume, and queue-based scheduling with open-source tools such as Kueue. Come away with a grounding in GPU usage patterns and common pitfalls, and an understanding of how to improve GPU utilization in AI, machine learning, and cloud computing environments without adding unnecessary complexity to your infrastructure.
Syllabus
The GPU Utilization Problem: What’s Going Wrong and How to Solve It - DevConf.IN 2026
Taught by
DevConf