Overview
This presentation by Vaibhav Katkade, Product Manager at Google Cloud Networking, explores infrastructure enhancements in cloud networking designed specifically for AI/ML workloads. It walks through the complete AI/ML lifecycle, including training, fine-tuning, and inference, with detailed explanations of the network requirements for each phase. Learn how Google Cloud's interconnect solutions enable fast, secure data transfer from on-premises environments, and how GKE clusters now scale to 65,000 nodes to accommodate large models such as Gemini.

The session also introduces the GKE inference gateway, which optimizes LLM serving through intelligent load balancing based on KV cache utilization, delivering 60% lower latency and 40% higher throughput. The gateway enables autoscaling driven by model server metrics, supports multiplexing of LoRA fine-tuned adapters, and integrates with security tools such as Google's Model Armor. Finally, the presentation addresses GPU/TPU capacity constraints across regions: a single inference gateway routes requests to available capacity while giving platform teams centralized control and consistent security coverage across all models. Recorded at AI Infrastructure Field Day in Santa Clara on April 22, 2025.
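To make the load-balancing idea concrete, here is a minimal sketch of KV-cache-aware replica selection. This is not the GKE inference gateway's actual API or algorithm; the `Replica` fields, the `pick_replica` helper, and the 0.8 saturation threshold are all illustrative assumptions about how a router might prefer model servers with free KV cache over ones near saturation.

```python
from dataclasses import dataclass

@dataclass
class Replica:
    # Hypothetical per-replica metrics a model server might report.
    name: str
    kv_cache_utilization: float  # fraction of KV cache in use, 0.0-1.0
    queue_depth: int             # requests waiting on this replica

def pick_replica(replicas: list[Replica], saturation: float = 0.8) -> Replica:
    """Prefer replicas whose KV cache is below the saturation threshold;
    among those, choose the least-utilized one. If every replica is
    saturated, fall back to the shortest request queue."""
    candidates = [r for r in replicas if r.kv_cache_utilization < saturation]
    if candidates:
        return min(candidates, key=lambda r: r.kv_cache_utilization)
    return min(replicas, key=lambda r: r.queue_depth)
```

A round-robin balancer would ignore these signals and could send a long-context request to a replica whose KV cache is nearly full; steering by cache utilization instead is the intuition behind the latency and throughput gains the presentation cites.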
Syllabus
Secure and optimize AI and ML workloads with the Cross-Cloud Network with Google Cloud
Taught by
Tech Field Day