An Open Source AI Compute Stack - Kubernetes + Ray + PyTorch + vLLM
CNCF [Cloud Native Computing Foundation] via YouTube
Overview
Explore a conference talk on the emerging software stack for running compute-intensive AI workloads at scale. Discover how the combination of Kubernetes, Ray, PyTorch, and vLLM addresses the demands of AI workloads: massive scale in both compute and data, and significant heterogeneity across workloads, models, data types, and hardware accelerators. Learn about the fragmented and rapidly evolving nature of AI software stacks and why companies productionizing AI often need large AI platform teams to manage these workloads. Examine the role each framework plays and how the four operate together as a cohesive stack. Gain insights from real-world case studies at Pinterest, Uber, and Roblox, along with examples from today's most popular post-training frameworks. Understand the common patterns emerging within the fragmented AI landscape and how this open source stack provides a foundation for scalable AI infrastructure.
Syllabus
An Open Source AI Compute Stack: Kubernetes + Ray + PyTorch + vLLM - Robert Nishihara, Anyscale
Taught by
CNCF [Cloud Native Computing Foundation]