
Smart GPU Management - Dynamic Pooling, Sharing, and Scheduling for AI Workloads in Kubernetes

CNCF [Cloud Native Computing Foundation] via YouTube

Overview

Learn how to optimize GPU utilization for AI workloads in Kubernetes through dynamic pooling, sharing, and scheduling techniques in this conference talk. Explore the challenge of balancing performance, flexibility, and isolation in GPU management, often referred to as the "Impossible Trinity." Discover the pros and cons of GPU sharing technologies such as vCUDA, MPS, and MIG, and understand the complexity that arises when a single cluster mixes several sharing techniques, each with its own resource names and configuration.

See how these methods can be combined seamlessly: users specify only memory and core-count requirements, without having to manage GPU types or sharing methods directly. The system automatically selects the best node and sharing method based on user preferences and available GPU resources, translates requests into optimal profiles, and dynamically partitions GPUs. Finally, examine how this approach streamlines GPU management, improves utilization, and strengthens scheduling by integrating the Volcano and HAMi projects for GPU pooling and scheduling in AI workload management.
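As a rough illustration of the user-facing model described above, a pod can request a slice of GPU memory and compute without naming a GPU type or sharing method. The sketch below uses HAMi-style extended resource names (`nvidia.com/gpumem`, `nvidia.com/gpucores`) and the Volcano scheduler; the exact resource names and values depend on the cluster's HAMi configuration and are assumptions, not taken from the talk.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-share-demo
spec:
  schedulerName: volcano          # assumes Volcano is installed as a scheduler
  containers:
  - name: trainer
    image: nvidia/cuda:12.4.0-base-ubuntu22.04
    resources:
      limits:
        nvidia.com/gpu: 1         # one (possibly shared) GPU
        nvidia.com/gpumem: 8000   # requested GPU memory in MB
        nvidia.com/gpucores: 50   # requested share of GPU compute, in percent
```

With a request like this, the scheduler and device plugin decide whether to satisfy it via MIG partitioning, MPS, or vCUDA-style virtualization, so the user never selects a sharing technology explicitly.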

Syllabus

Smart GPU Management: Dynamic Pooling, Sharing, and Scheduling for AI Workloads in Kubernetes - Wei Chen & Mengxuan Li

Taught by

CNCF [Cloud Native Computing Foundation]

