
Building Scalable ML Inferencing Pipelines Using Kubernetes

CNCF [Cloud Native Computing Foundation] via YouTube

Overview

Explore a conference talk that delves into building robust and scalable machine learning inference pipelines on Kubernetes. Learn how to construct performant inferencing services that scale on demand while maintaining low latency. Discover proven procedures and guidelines for managing inference pipelines on Kubernetes, including detailed insights into hardware requirements (GPU/CPU/memory) and essential Kubernetes configurations for various inference engines. Master the implementation of fault-tolerant pipelines for Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) using fundamental Kubernetes constructs such as Operators, StatefulSets, and persistent volumes. Gain practical knowledge about setting up automated monitoring and effective strategies for troubleshooting and fixing hardware and software component failures in production environments.
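As a rough illustration of the constructs the talk covers, the sketch below shows a StatefulSet that serves an inference engine with a GPU request and a per-replica persistent volume for model weights. This is not taken from the talk itself; all names, images, and resource values are hypothetical placeholders.

```yaml
# Hypothetical sketch: a StatefulSet serving an LLM inference engine.
# Every name and value here is illustrative, not from the talk.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: llm-inference              # hypothetical service name
spec:
  serviceName: llm-inference
  replicas: 2
  selector:
    matchLabels:
      app: llm-inference
  template:
    metadata:
      labels:
        app: llm-inference
    spec:
      containers:
        - name: inference-engine
          image: example.com/inference-engine:latest  # placeholder image
          resources:
            limits:
              nvidia.com/gpu: 1    # one GPU per replica (needs the NVIDIA device plugin)
              memory: 32Gi
          volumeMounts:
            - name: model-store
              mountPath: /models
  volumeClaimTemplates:            # gives each replica its own PersistentVolume
    - metadata:
        name: model-store
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 100Gi
```

Using a StatefulSet rather than a Deployment gives each replica a stable identity and its own volume claim, which suits engines that cache large model weights locally.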

Syllabus

Scalable ML Inferencing Pipeline Using K8s - Smitha Jayaram & Vinod Eswaraprasad, NVIDIA

Taught by

CNCF [Cloud Native Computing Foundation]

