KServe Next - Advancing Generative AI Model Serving
CNCF [Cloud Native Computing Foundation] via YouTube
Overview
Explore the evolution of generative AI model serving infrastructure in this conference talk, which traces the path from custom deployment patterns to Kubernetes-native serving platforms. Discover the current challenges in deploying and scaling large language models, including inference performance optimization, KV-cache management, distributed execution strategies, and cost optimization. Learn about the KServe v0.17 release, which adds support for generative AI workloads through a dedicated LLMInferenceService Custom Resource Definition designed for LLM-serving capabilities such as disaggregated serving, model and KV caching, and integration with the open source Envoy AI Gateway. Understand how to prepare your infrastructure for generative AI workloads so that model serving stays scalable, efficient, and interoperable.
Syllabus
KServe Next: Advancing Generative AI Model Serving - Yuan Tang, Red Hat & Dan Sun, Bloomberg
Taught by
CNCF [Cloud Native Computing Foundation]