
CNCF [Cloud Native Computing Foundation]

Meta's Kubernetes-based Portable AI Research Environment

CNCF [Cloud Native Computing Foundation] via YouTube

Overview

Learn how Meta developed a Kubernetes-based portable AI research environment in this conference talk from KubeCon + CloudNativeCon. Discover Meta's collaboration with CoreWeave to implement SUNK (Slurm on Kubernetes), creating a unified computing platform that enables AI researchers to work consistently across diverse multi-cloud infrastructures. Explore how this solution addresses the growing demands of AI research by providing a familiar Slurm interface while abstracting away underlying infrastructure complexity through Kubernetes orchestration. Understand the architecture that delivers secure per-user isolation, shared storage mounts, streamlined access management, and comprehensive health checking across heterogeneous environments. Examine how the platform enables infrastructure engineers to deploy consistent, portable solutions across multiple cloud providers while maintaining deep centralized observability and unified security controls. Gain insights into novel patterns that let users deploy infrastructure on Kubernetes without having to deal with the underlying complexity, and learn how OpenTelemetry serves as a unified interface for both platform-level and research-level monitoring and insights.

Syllabus

Meta’s Kubernetes-based Portable AI Research Environment - Shaun Hopper, Meta & Navarre Pratt

Taught by

CNCF [Cloud Native Computing Foundation]
