Scaling Private LLM Model Services with Kserve and Modelcar OCI - A Real-World Implementation
CNCF [Cloud Native Computing Foundation] via YouTube
Overview
Learn how to deploy and scale private Large Language Models (LLMs) effectively in a conference talk that presents a real-world implementation using KServe and Modelcar OCI. Explore the complexities of LLM deployment and see how Kubernetes, and in particular KServe with the Modelcar OCI storage backend, streamlines the process. Follow practical demonstrations of KServe's model-serving capabilities in Kubernetes environments, including GPU utilization and integration with existing workflows. Understand how Modelcar OCI artifacts improve artifact delivery over traditional container images, reducing storage duplication, speeding up downloads, and simplifying governance. Gain insights into implementation strategies, best practices, and lessons learned for improving MLOps workflows, and learn techniques for applying Kubernetes, KServe, and OCI artifacts to common challenges in deploying and scaling private LLM services.
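To make the Modelcar approach concrete, the sketch below shows what a KServe InferenceService manifest using an OCI model image might look like. The registry path, model name, and runtime are illustrative assumptions, not taken from the talk; the `oci://` storageUri relies on KServe's modelcar feature being enabled in the cluster's storage-initializer configuration.

```yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: private-llm                    # hypothetical service name
spec:
  predictor:
    model:
      modelFormat:
        name: huggingface              # assumes a Hugging Face serving runtime
      # Modelcar: the model weights are pulled as an OCI image rather than
      # downloaded from object storage, so layers are cached and deduplicated
      # by the container runtime.
      storageUri: oci://registry.example.com/models/private-llm:1.0
      resources:
        limits:
          nvidia.com/gpu: "1"          # one GPU per replica
```

Because the model ships as an OCI artifact, nodes that already hold the image layers skip the download entirely, which is the storage-deduplication and startup-speed benefit the talk highlights.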
Syllabus
Scaling Private LLM Model Services with Kserve and Modelcar OCI: A Real-World... - Mayuresh Krishna
Taught by
CNCF [Cloud Native Computing Foundation]