CafeGPT - Serving LLMs Like Coffee With Kubernetes
CNCF [Cloud Native Computing Foundation] via YouTube
Overview
Learn the fundamentals of serving Large Language Models (LLMs) on Kubernetes through a coffee shop analogy in this 26-minute conference talk. Explore how Kubernetes has become the standard platform for LLM workloads, and pick up the core concepts of LLM inference, efficient deployment strategies, and GPU scheduling without being overwhelmed by the rapidly evolving ecosystem. Discover how to separate the fundamental principles from the varied features offered by today's Kubernetes-based solutions. Throughout, the talk makes these concepts accessible by drawing parallels between operating LLM inference systems on Kubernetes and running a successful cafe.
Syllabus
CafeGPT: Serving LLMs Like Coffee With Kubernetes - Madhav Jivrajani & Kartik Ramesh, UIUC
Taught by
CNCF [Cloud Native Computing Foundation]