Overview
Explore Sokovan, a standalone open-source AI container orchestration engine designed for both bare-metal and Kubernetes-based environments, in this 33-minute conference talk. Learn how this specialized engine runs AI workloads with direct control over hardware acceleration while remaining compatible with Kubernetes infrastructure. Discover the results of the world's first container-based GDS-enabled AI cluster from 2022, which delivered over 120 Gb/s throughput with multi-node GPU workloads and 150+ Gb/s RDMA performance. Understand the dual orchestrator architecture, in which Kubernetes provides the foundational platform and manages infrastructure components while Sokovan delivers AI-optimized scheduling, fractional GPU scaling, and vendor-aware storage acceleration. Examine practical deployment strategies for air-gapped environments and performance comparisons showing better results than pure Kubernetes solutions. Gain insights into phased migration approaches that run both orchestrators in parallel, so workloads can transition between them without compromising performance or either system's capabilities.
Syllabus
Sokovan: An Acceleration-first Kubernetes-compatible AI Container Engine - Joongi Kim, Lablup Inc.
Taught by
Linux Foundation