Overview
Discover how Google Kubernetes Engine (GKE) and AI Hypercomputer are changing AI infrastructure in this presentation by Ishan Sharma, Group Product Manager on the Google Kubernetes Engine team. Learn about infrastructure designed for training at mega scale, serving with reduced cost and latency, providing economical access to GPUs and TPUs, and accelerating time to value. Understand Google Cloud's commitment to making new accelerators available on GKE from day one, and how the AI Hypercomputer stack, the same one used internally for Vertex AI, functions as a complete reference architecture. Explore Cluster Director for GKE, which allows deployment, scaling, and management of AI-optimized clusters where co-located accelerators work as a single unit for high performance and ultra-low latency. See demonstrations of the GKE Inference Gateway, which improves the routing of LLM requests based on server metrics, and the GKE Inference Quickstart feature, which recommends optimized infrastructure configurations for different models. This 28-minute talk was recorded live in Santa Clara on April 22, 2025, as part of AI Infrastructure Field Day.
Syllabus
Google Kubernetes Engine and AI Hypercomputer with Google Cloud
Taught by
Tech Field Day