Overview
Coursera Flash Sale
40% Off Coursera Plus for 3 Months!
Grab it
This conference talk explores strategies for scaling GPU clusters in Kubernetes environments as GPUs become more powerful and capable of handling concurrent workloads. Learn from NVIDIA's experience in right-sizing a Kubernetes control plane while meeting increasing business demands. Discover how to measure control plane resource consumption and implement techniques that improve performance and scalability, including golang tunables, kube-apiserver parameters like goaway-chance, and scheduler configurations. Understand the often overlooked impact of YAML volume per API call on system performance. Explore how simulation techniques such as KWOK (Kubernetes WithOut Kubelet) can be used to evaluate new Kubernetes features like Dynamic Resource Allocation (DRA) for control-plane scalability before production deployment.
Syllabus
Scaling GPU Clusters Without Melting Down! - Alay Patel & Ryan Hallisey, NVIDIA
Taught by
CNCF [Cloud Native Computing Foundation]