Sokovan - Container Orchestrator for Accelerated AI/ML Workloads and Massive Scale GPU Computing
OpenInfra Foundation via YouTube
Overview
Learn about Sokovan, a Python-based container orchestrator, in this 28-minute conference talk presented by Jeongkyu Shin and Joongi Kim. Discover how it manages resource-intensive batch workloads in containerized environments through acceleration-aware, multi-tenant scheduling. Explore its two-level scheduling system: a cluster-level scheduler that supports customizable job placement strategies and workload control, and a node-level scheduler that optimizes per-container performance by automatically mapping hardware accelerators to containers. Hear the speakers explain how this design can outperform traditional tools such as Slurm on AI workloads, and how it has been deployed across various industries for GPU-intensive tasks, including AI training and AI services. See how integrating multiple hardware acceleration technologies helps container-based MLOps platforms get the most out of the latest hardware.
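To make the two-level idea concrete, here is a minimal Python sketch. It is not Sokovan's actual API or algorithm: the names `Node`, `Job`, `pick_node`, `map_devices`, and `schedule` are hypothetical, and the best-fit placement rule is only an assumed example of a cluster-level strategy. The cluster level chooses a node for each job, and the node level then assigns specific accelerator device indices to the job's container.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    free_gpus: list[int]  # indices of GPUs not yet assigned on this node


@dataclass
class Job:
    name: str
    gpus: int  # number of GPUs the job requests


def pick_node(job, nodes):
    """Cluster level (assumed best-fit policy): among nodes that can hold
    the job, choose the one with the fewest free GPUs."""
    candidates = [n for n in nodes if len(n.free_gpus) >= job.gpus]
    if not candidates:
        return None
    return min(candidates, key=lambda n: len(n.free_gpus))


def map_devices(job, node):
    """Node level: hand out concrete device indices and mark them as used."""
    assigned = node.free_gpus[:job.gpus]
    node.free_gpus = node.free_gpus[job.gpus:]
    return assigned


def schedule(jobs, nodes):
    """Place each job in order; jobs that fit nowhere map to None."""
    placements = {}
    for job in jobs:
        node = pick_node(job, nodes)
        if node is None:
            placements[job.name] = None
            continue
        placements[job.name] = (node.name, map_devices(job, node))
    return placements


nodes = [Node("a", [0, 1, 2, 3]), Node("b", [0, 1])]
jobs = [Job("train", 2), Job("big", 4), Job("extra", 1)]
placements = schedule(jobs, nodes)
print(placements)
```

In this sketch the placement policy lives entirely in `pick_node`, so swapping in a different strategy (for example, spreading jobs across nodes instead of packing them) would not touch the device-mapping step. A real orchestrator would also account for tenants, memory, and accelerator topology, which this example leaves out.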
Syllabus
Sokovan - Container Orchestrator for Accelerated AI/ML Workloads and Massive Scale GPU Computing
Taught by
OpenInfra Foundation