Overview
In this 25-minute conference talk from OpenUK, learn essential patterns, common pitfalls, and performance optimization techniques for deploying Large Language Models (LLMs) on Kubernetes in production environments. Explore proven deployment strategies, discover key considerations for maintaining reliable LLM services, and gain practical insights into scaling and managing LLM workloads within Kubernetes clusters.
Syllabus
Production-Ready LLMs on Kubernetes: Patterns, Pitfalls, and Performance
Taught by
OpenUK