
Help! My LLM Is a Resource Hog - How We Tamed Inference With Kubernetes and Open Source Muscle

CNCF [Cloud Native Computing Foundation] via YouTube

Overview

Learn how to optimize large language model (LLM) inference performance and resource management using Kubernetes and open-source CNCF tools in this 26-minute conference talk. Discover practical solutions for addressing common LLM deployment challenges including slow inference speeds, unpredictable GPU usage, and escalating costs through a real-world case study presented by experts from Forrester Research and vCluster. Master the implementation of KServe and Kubeflow for reliable LLM serving, explore benchmarking and auto-scaling techniques using Volcano and KEDA to optimize resource utilization and reduce latency, and understand how to monitor model performance and detect drift using Prometheus, Grafana, and OpenTelemetry. Gain insights from field-tested architectures, performance benchmarks, and lessons learned while building production-ready, efficient, and scalable LLM inference systems using entirely open-source tooling that you can implement immediately.
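The serving pattern the talk describes can be sketched with a minimal KServe InferenceService manifest. This is an illustrative example, not the speakers' actual configuration: the model name, runtime, and GPU request are assumptions for a single-GPU Hugging Face model deployment.

```yaml
# Hypothetical KServe InferenceService for a small LLM.
# Names ("llm-demo") and resource sizes are illustrative assumptions.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: llm-demo
spec:
  predictor:
    model:
      modelFormat:
        name: huggingface        # KServe's built-in Hugging Face runtime
      args:
        - --model_name=llm-demo
      resources:
        requests:
          nvidia.com/gpu: "1"    # pin inference to one GPU
        limits:
          nvidia.com/gpu: "1"
```

Applying this with `kubectl apply -f` gives KServe-managed serving with request routing and revisioned rollouts, which is the "reliable LLM serving" layer the talk builds on.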
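The KEDA-based autoscaling approach mentioned above can likewise be sketched as a ScaledObject driven by a Prometheus metric. The target deployment name, Prometheus address, query, and threshold are all assumptions for illustration:

```yaml
# Hypothetical KEDA ScaledObject scaling an LLM deployment on request rate.
# serverAddress, query, and replica bounds are illustrative assumptions.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: llm-demo-autoscale
spec:
  scaleTargetRef:
    name: llm-demo-predictor      # assumed name of the serving Deployment
  minReplicaCount: 1              # keep one warm replica to avoid cold starts
  maxReplicaCount: 4              # cap GPU spend
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring.svc:9090
        query: sum(rate(http_requests_total{service="llm-demo"}[2m]))
        threshold: "10"           # add a replica per ~10 req/s sustained
```

Scaling on an application-level metric (request rate or queue depth) rather than raw GPU utilization is what lets this setup trade latency against cost, the tension the talk addresses.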

Syllabus

Help! My LLM Is a Resource Hog: How We Tamed Inference With Kubernetes... Aditya Soni & Hrittik Roy

Taught by

CNCF [Cloud Native Computing Foundation]

