Partitionable Devices - Putting the Dynamic Back in Dynamic Resource Allocation
CNCF [Cloud Native Computing Foundation] via YouTube
Overview
This 32-minute CNCF conference talk explores how Dynamic Resource Allocation (DRA) changes GPU partitioning in Kubernetes. Using NVIDIA's Multi-Instance GPU (MIG) partitions in Kubernetes has traditionally required static pre-provisioning or specialized tooling. The talk shows how the latest DRA implementation provisions GPU partitions on demand based on workload requirements: you specify the memory a workload needs, and Kubernetes dynamically creates an appropriately sized partition. It covers practical applications for inference workloads with smaller models that do not need a full GPU, and shows how the approach extends to other accelerators such as Google's TPUs. The presenters, experts from Google and NVIDIA, detail the technical implementation, demonstrate the technology live, and discuss how it improves GPU utilization.
Syllabus
Partitionable Devices: Putting the “Dynamic” Back in... Morten Jæger Torkildsen & Jan-Philip Gehrcke
Taught by
CNCF [Cloud Native Computing Foundation]