AI Inference Without Boundaries - Dynamic Routing With Multi-Cluster Inference Gateway
CNCF [Cloud Native Computing Foundation] via YouTube
Overview
Learn how to overcome GPU scarcity and scale AI inference workloads across multiple Kubernetes clusters in this 30-minute conference talk from CNCF. Discover the Multi-Cluster Inference Gateway, an open-source solution that uses Gateway API and multi-cluster patterns to route AI inference traffic dynamically to available GPU resources across distributed clusters. Explore practical deployment strategies for maximizing GPU utilization, optimizing costs, and maintaining high availability for AI workloads that exceed single-cluster capacity. See real-world implementation examples that show how to minimize latency while scaling AI serving infrastructure beyond a single cluster, distributing traffic according to resource availability.
Syllabus
AI Inference Without Boundaries: Dynamic Routing With Multi-Cluster In... Rob Scott & Daneyon Hansen
Taught by
CNCF [Cloud Native Computing Foundation]