
Scalable LLM Inference on Kubernetes With NVIDIA NIMS, LangChain, Milvus and FluxCD

Linux Foundation via YouTube

Overview

Explore the architecture and implementation of a scalable LLM inference service on Amazon EKS in this 33-minute conference talk from the Linux Foundation. Dive into workload orchestration with Kubernetes as the foundation, integrating NVIDIA NIMS for optimized GPU utilization, LangChain for flexible LLM operations, and Milvus for efficient vector storage. Learn how to use FluxCD for GitOps-driven deployments, Karpenter for autoscaling cluster nodes, and Prometheus and Grafana for comprehensive observability. Discover best practices for building production-ready LLM inference systems that scale effectively in cloud-native environments, combining modern AI tooling with robust Kubernetes orchestration patterns.
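The retrieval-augmented inference pattern the talk describes (embed a query, fetch relevant context from a vector store, then prompt the LLM endpoint) can be sketched in miniature. This is an illustrative stand-in, not the speaker's code: Milvus and the NIM endpoint are replaced with an in-memory store and a prompt string, and names like `InMemoryVectorStore` are hypothetical. Relevance here is approximated with simple keyword overlap so the sketch runs anywhere without an embedding model.

```python
import re

def words(text: str) -> set[str]:
    """Lowercase, punctuation-free token set for crude relevance scoring."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

class InMemoryVectorStore:
    """Stand-in for Milvus: a real deployment embeds documents with a model
    and searches by vector similarity; here we rank by keyword overlap."""
    def __init__(self):
        self.docs: list[str] = []

    def add(self, doc: str) -> None:
        self.docs.append(doc)

    def search(self, query: str, k: int = 1) -> list[str]:
        q = words(query)
        ranked = sorted(self.docs, key=lambda d: len(q & words(d)), reverse=True)
        return ranked[:k]

def build_prompt(query: str, store: InMemoryVectorStore) -> str:
    # LangChain-style retrieval step: ground the prompt in retrieved context
    # before sending it to the inference endpoint (e.g. a NIM microservice
    # exposing an OpenAI-compatible API behind a Kubernetes Service).
    context = "\n".join(store.search(query, k=1))
    return f"Context:\n{context}\n\nQuestion: {query}"

store = InMemoryVectorStore()
store.add("Karpenter provisions EC2 nodes on demand for pending pods.")
store.add("FluxCD reconciles cluster state from a Git repository.")
print(build_prompt("How does FluxCD deploy changes?", store))
```

In the production setup covered by the talk, the same flow maps to Milvus for the vector search, LangChain for chaining the retrieval and generation steps, and a NIM service on GPU nodes for the final completion call.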

Syllabus

Scalable LLM Inference on Kubernetes With NVIDIA NIMS, LangChain, Milvus and FluxCD - Riccardo Freschi

Taught by

Linux Foundation

