CNCF [Cloud Native Computing Foundation]

KServe Next - Advancing Generative AI Model Serving

CNCF [Cloud Native Computing Foundation] via YouTube

Overview

Explore the evolution of generative AI model serving infrastructure in this conference talk, which traces the journey from custom deployment patterns to Kubernetes-native serving platforms. The talk covers current challenges in deploying and scaling large language models, including inference performance optimization, KV-cache management, distributed execution strategies, and cost optimization. Learn about the KServe v0.17 release, which adds support for generative AI workloads through a dedicated LLMInferenceService Custom Resource Definition designed for LLM-serving capabilities such as disaggregated serving, model and KV caching, and integration with the open source Envoy AI Gateway. The talk also explains how to prepare your infrastructure for generative AI workloads so that model serving stays scalable, efficient, and interoperable.

Syllabus

KServe Next: Advancing Generative AI Model Serving - Yuan Tang, Red Hat & Dan Sun, Bloomberg

Taught by

CNCF [Cloud Native Computing Foundation]
