AI Inference Without Boundaries - Dynamic Routing With Multi-Cluster Inference Gateway
CNCF [Cloud Native Computing Foundation] via YouTube
Overview
Learn how to overcome GPU scarcity and scale AI inference workloads across multiple Kubernetes clusters in this 30-minute conference talk from CNCF. Discover the Multi-Cluster Inference Gateway, an open-source solution that uses the Gateway API and multi-cluster patterns to dynamically route AI inference traffic to available GPU resources across distributed clusters. Explore practical deployment strategies for maximizing GPU utilization, optimizing costs, and maintaining high availability for AI workloads that exceed single-cluster capacity. See real-world implementation examples that show how to minimize latency while scaling AI serving infrastructure beyond a single cluster, with traffic distributed according to resource availability.
Syllabus
AI Inference Without Boundaries: Dynamic Routing With Multi-Cluster In... Rob Scott & Daneyon Hansen
Taught by
CNCF [Cloud Native Computing Foundation]