Slurm Bridge - Slurm Scheduling Superpowers in Kubernetes
CNCF [Cloud Native Computing Foundation] via YouTube
Overview
Explore how to enhance Kubernetes scheduling for multi-node AI inference, AI training, and High Performance Computing workloads by integrating Slurm scheduling technology. Learn why the native Kubernetes scheduler, designed around microservices, falls short for complex workload placement scenarios that require optimal resource allocation. Discover how Slurm scheduling addresses factors Kubernetes lacks natively, such as node hardware topology, workload planning, fair-use policies, and batch-style scheduling. Understand the implementation of the recently released Slurm Bridge from the SlinkyProject, which combines the Kubernetes Scheduling Framework with the Slurm scheduler to make intelligent multi-node placement decisions. See demonstrations of how Slurm's fine-grained resource control works with DRA drivers for CPUs, NICs, and GPUs to elevate Kubernetes into a platform capable of large-scale, granular resource scheduling for demanding computational workloads.
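As a rough illustration of the delegation model described above, a Pod can hand its placement decision to an alternative scheduler through the standard `schedulerName` field in its spec. The scheduler name, image, and resource values below are assumptions for illustration only; consult the SlinkyProject slurm-bridge documentation for the actual configuration.

```yaml
# Hypothetical sketch: a Pod opting into Slurm-backed scheduling via
# the Kubernetes schedulerName field. Names and values are illustrative
# assumptions, not taken from the talk.
apiVersion: v1
kind: Pod
metadata:
  name: training-worker-0
spec:
  schedulerName: slurm-bridge-scheduler  # assumed name; delegates placement to Slurm
  containers:
    - name: worker
      image: registry.example.com/hpc/training-app:latest  # placeholder image
      resources:
        limits:
          cpu: "8"
          memory: 32Gi
          nvidia.com/gpu: "1"  # GPU request, resolvable via a DRA or device-plugin driver
```

With a setup along these lines, Slurm can weigh topology, fair-use, and batch-queue policy before the Pod is bound to a node, while ordinary Pods continue to use the default Kubernetes scheduler.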
Syllabus
Slurm Bridge: Slurm Scheduling Superpowers in Kubernetes - Alan Mutschelknaus & Tim Wickberg
Taught by
CNCF [Cloud Native Computing Foundation]