Overview
This conference talk explores strategies for scaling GPU clusters in Kubernetes environments as GPUs become more powerful and capable of handling concurrent workloads. Learn from NVIDIA's experience in right-sizing a Kubernetes control plane while meeting increasing business demands. Discover how to measure control plane resource consumption and apply techniques that improve performance and scalability, including Go runtime tunables, kube-apiserver parameters such as goaway-chance, and scheduler configuration. Understand the often-overlooked impact that the volume of YAML sent per API call has on system performance. Explore how simulation tools such as KWOK (Kubernetes WithOut Kubelet) can be used to evaluate the control-plane scalability of new Kubernetes features like Dynamic Resource Allocation (DRA) before production deployment.
Syllabus
Scaling GPU Clusters Without Melting Down! - Alay Patel & Ryan Hallisey, NVIDIA
Taught by
CNCF [Cloud Native Computing Foundation]