No More GPU Cold Starts - Making Serverless ML Inference Truly Real-Time

CNCF [Cloud Native Computing Foundation] via YouTube

Overview

This 31-minute conference talk from CNCF covers how to reduce GPU cold start delays in serverless machine learning inference. Cold starts can stretch response times from milliseconds to minutes, which hurts real-time performance and increases costs. The talk walks through the anatomy of a GPU cold start in a modern ML serving stack: container initialization, GPU driver loading, and heavyweight model deserialization. It also looks at the challenges GPUs add to the cold path, how the Container Runtime Interface (CRI) and device plugins contribute to startup latency, and what happens when a PyTorch model boots on a fresh pod.

The talk then presents production-ready strategies for reducing startup latency: pre-warmed GPU pod pools that bypass initialization time, model snapshotting with TorchScript or ONNX for faster deserialization, and lazy loading that defers model initialization until the first request arrives.
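As a rough illustration of two of the strategies named above, the sketch below contrasts lazy loading (pay the load cost on the first request) with a pre-warmed pool (pay it before any request arrives). It is not taken from the talk. The names LazyModel, WarmPool, and fake_load_model are hypothetical, and the loader only sleeps to stand in for GPU model deserialization, so the example runs with the Python standard library alone.

```python
import threading
import time
from typing import Callable, Generic, List, Optional, TypeVar

T = TypeVar("T")


class LazyModel(Generic[T]):
    """Defers an expensive load until the first request, then caches it."""

    def __init__(self, loader: Callable[[], T]) -> None:
        self._loader = loader
        self._model: Optional[T] = None
        self._lock = threading.Lock()
        self.load_count = 0  # how many times the loader actually ran

    def get(self) -> T:
        # Fast path: once loaded, return the cached model without locking.
        if self._model is None:
            with self._lock:
                # Re-check inside the lock so concurrent first requests
                # trigger only one load.
                if self._model is None:
                    self._model = self._loader()
                    self.load_count += 1
        return self._model


def fake_load_model() -> Callable[[float], float]:
    """Hypothetical stand-in for deserializing model weights onto a GPU."""
    time.sleep(0.05)  # simulated cold-start cost (assumption, not measured)
    return lambda x: 2.0 * x


class WarmPool:
    """Keeps a fixed number of already-loaded workers ready to serve."""

    def __init__(self, size: int, loader: Callable[[], Callable[[float], float]]) -> None:
        self.workers: List[LazyModel] = [LazyModel(loader) for _ in range(size)]
        for worker in self.workers:
            worker.get()  # pay the load cost up front, off the request path
        self._next = 0

    def predict(self, x: float) -> float:
        # Simple round-robin over workers that are already warm.
        worker = self.workers[self._next % len(self.workers)]
        self._next += 1
        return worker.get()(x)


if __name__ == "__main__":
    lazy = LazyModel(fake_load_model)
    start = time.perf_counter()
    lazy.get()(1.0)  # first request includes the simulated load
    first = time.perf_counter() - start

    start = time.perf_counter()
    lazy.get()(1.0)  # later requests reuse the cached model
    second = time.perf_counter() - start
    print(f"first request: {first:.3f}s, second request: {second:.6f}s")
```

The trade-off in this sketch is where the delay lands: lazy loading keeps idle instances cheap but makes the first caller wait, while the warm pool spends resources ahead of time so that no request sees the load.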

Syllabus

No More GPU Cold Starts: Making Serverless ML Inference Truly Real-Time - Nikunj Goyal & Aditi Gupta

Taught by

CNCF [Cloud Native Computing Foundation]
