CNCF [Cloud Native Computing Foundation]

Multi-Cluster Wars - The Scheduler Awakens

CNCF [Cloud Native Computing Foundation] via YouTube

Overview

Explore how multi-cluster batch schedulers address the scalability challenges of AI/ML workloads in this conference talk from KubeCon + CloudNativeCon. Learn about the limitations of single-cluster schedulers when handling millions of diverse, resource-intensive batch jobs, from GPU bursts for training to CPU- and memory-intensive preprocessing tasks. Discover how multi-cluster schedulers federate Kubernetes clusters to dynamically extend capacity across on-premises and cloud environments while providing tenant isolation and resilience to zone outages. Examine the implementation of critical batch scheduling features: globally coordinated preemption to reclaim capacity, fair-share quota enforcement to distribute compute equitably among teams, and gang scheduling that reserves resources across clusters so that multi-node jobs launch in sync. Gain insights into the architectural approaches multi-cluster schedulers use to overcome etcd scalability limits and single-region failure domains while maintaining efficient resource utilization across federated Kubernetes environments.
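
To make the gang scheduling idea concrete, here is a minimal sketch of all-or-nothing placement over a set of clusters. It is not taken from the talk: the `Cluster` class, the `gang_schedule` function, the GPU-only resource model, and the choice to let one gang span several clusters are all assumptions made for illustration. A real scheduler may restrict a gang to a single cluster and would track many more resource types.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    """Hypothetical view of one member cluster: a name and its free GPUs."""
    name: str
    free_gpus: int


def gang_schedule(clusters, pods_needed, gpus_per_pod):
    """Place every pod of a gang, or place none of them.

    Returns a {cluster_name: pod_count} plan and deducts the reserved
    GPUs if the whole gang fits; returns None and changes nothing if not.
    """
    plan = {}
    remaining = pods_needed

    # Assumption: fill the clusters with the most free GPUs first.
    for cluster in sorted(clusters, key=lambda c: c.free_gpus, reverse=True):
        if remaining == 0:
            break
        fits = min(remaining, cluster.free_gpus // gpus_per_pod)
        if fits > 0:
            plan[cluster.name] = fits
            remaining -= fits

    if remaining > 0:
        # Part of the gang would be left waiting, so reserve nothing.
        return None

    # The whole gang fits: commit the reservation.
    by_name = {c.name: c for c in clusters}
    for name, count in plan.items():
        by_name[name].free_gpus -= count * gpus_per_pod
    return plan
```

The planning pass only reads capacity, and the reservation is committed in a separate step once the full gang is known to fit. That ordering is what prevents a partially launched job from holding GPUs while it waits for the rest of its pods.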

Syllabus

Multi-Cluster Wars: The Scheduler Awakens - Dejan Pejchev & Priyanka Ravi, G-Research

Taught by

CNCF [Cloud Native Computing Foundation]
