Empower Large Language Models Serving in Production with Cloud Native AI Technologies

CNCF [Cloud Native Computing Foundation] via YouTube

Overview

Explore the challenges of deploying Large Language Models (LLMs) in production and the Cloud Native AI technologies that address them. Learn how to optimize LLM serving by extending KServe to handle OpenAI's streaming requests, reducing model loading time with Fluid and Vineyard, and implementing cost-effective auto-scaling strategies. Gain insights from KServe and Fluid maintainers on overcoming production challenges, and discover practical techniques for balancing performance and cost in LLM deployments. Understand why timed auto-scaling with cronHPA matters and how to evaluate the cost-effectiveness of scaling processes, drawing on real-world experience and best practices.

Syllabus

Empower Large Language Models (LLMs) Serving in Production with Cloud Native...- Lize Cai & Yang Che

Taught by

CNCF [Cloud Native Computing Foundation]
