CNCF [Cloud Native Computing Foundation]

Efficient LLM Deployment - A Unified Approach with Ray, vLLM, and Kubernetes

CNCF [Cloud Native Computing Foundation] via YouTube

Overview

Explore a comprehensive conference talk that dives deep into deploying Large Language Models (LLMs) efficiently using an integrated approach combining Ray, vLLM, and Kubernetes. Learn how to leverage cloud-native orchestration, distributed computing, and advanced LLMOps in the wake of ChatGPT's transformative impact on the AI landscape. Discover the core components of a modern AI stack, including Kubernetes for managing AI workloads across cloud environments, Ray for developing and scaling distributed applications, and vLLM for high-performance, memory-efficient model inference and serving. Gain practical insights into architecting and integrating these powerful tools to drive innovation and optimize AI solution deployment, while addressing challenges such as GPU shortages and result verification.

Syllabus

Efficient LLM Deployment: A Unified Approach with Ray, vLLM, and Kubernetes - L (Xiaoxuan) Liu

Taught by

CNCF [Cloud Native Computing Foundation]
