Scaling Private LLM Model Services with Kserve and Modelcar OCI - A Real-World Implementation
CNCF [Cloud Native Computing Foundation] via YouTube
Overview
Learn how to effectively deploy and scale private Large Language Models (LLMs) in a conference talk that showcases a real-world implementation using Kserve and Modelcar OCI. Explore the complexities of LLM deployment and discover how Kubernetes, particularly Kserve with Modelcar OCI storage backend, streamlines the process. Dive into practical demonstrations of Kserve's capabilities for efficient model serving within Kubernetes environments, optimizing GPU utilization and enabling seamless integration. Understand how Modelcar OCI artifacts enhance artifact delivery beyond traditional container images, resulting in reduced storage duplication, faster download speeds, and simplified governance. Gain valuable insights into implementation strategies, best practices, and real-world lessons for improving MLOps workflows. Master the techniques to leverage Kubernetes, Kserve, and OCI artifacts effectively, leading to significant efficiency improvements and solutions to common challenges in private LLM service deployment and scaling.
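The Modelcar approach described above corresponds to KServe's OCI storage backend, where model weights are packaged as an OCI image and referenced with an `oci://` storage URI instead of being pulled from object storage or a PVC. A minimal sketch of such an InferenceService follows; the registry, image name, model format, and resource settings are illustrative assumptions, not details from the talk:

```yaml
# Hypothetical KServe InferenceService using a Modelcar OCI artifact.
# Registry, image tag, and model format below are placeholders.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: private-llm
spec:
  predictor:
    model:
      modelFormat:
        name: huggingface
      # The model weights are pulled as an OCI image ("modelcar")
      # rather than downloaded from S3 or a persistent volume.
      storageUri: oci://registry.example.com/models/private-llm:v1
      resources:
        limits:
          nvidia.com/gpu: "1"
```

Note that the modelcar feature must be enabled in KServe's storage-initializer configuration before `oci://` URIs are accepted; with it enabled, nodes can cache the model image layers, which is the source of the reduced duplication and faster startup the talk highlights.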
Syllabus
Scaling Private LLM Model Services with Kserve and Modelcar OCI: A Real-World... - Mayuresh Krishna
Taught by
CNCF [Cloud Native Computing Foundation]