AI, CERN, and the Quest for GPU Custody: How CERN Leverages DRA for Efficient GPU Sharing
CNCF [Cloud Native Computing Foundation] via YouTube
Overview
This conference talk explores how CERN uses Dynamic Resource Allocation (DRA) to share GPUs efficiently in Kubernetes environments. Presenters Diana Gaponcic from CERN and Jan-Philip Gehrcke from NVIDIA cover the current state of DRA, implementation updates, and feature additions, show how to get started with DRA, and explain its relevance for engineers looking to enhance GPU offerings on their clusters. Discover configuration techniques for time-slicing, MPS (Multi-Process Service), and MIG (Multi-Instance GPU), along with building custom layouts. The presentation demonstrates CERN's practical application of DRA for colocating machine learning workloads on the same GPU, including how to select an appropriate sharing mechanism based on performance requirements. Examine training and inference benchmarking results, understand how DRA creates a flexible and user-friendly system, and explore the tradeoffs of GPU sharing and how this approach can conserve valuable resources.
Syllabus
AI, CERN, and the Quest for GPU Custody: How CERN Leverages D... Diana Gaponcic & Jan-Philip Gehrcke
Taught by
CNCF [Cloud Native Computing Foundation]