
Google Cloud

AI Infrastructure: Networking Techniques

Google Cloud via Coursera

Overview

Welcome to the "AI Infrastructure: Networking Techniques" course. While AI Hypercomputer is renowned for its massive computational power using GPUs and TPUs, the secret to unlocking its full potential lies within the network. High-performance computing and large-scale model training demand incredibly fast, low-latency connections to continuously feed processors with data. In this course, you will learn to leverage Google Cloud's high-bandwidth, low-latency infrastructure to optimize data transfer and communication between all the components of your AI system. By the end, you will grasp the critical role networking plays across the entire AI pipeline, from data ingestion and training to inference, and you will be able to apply best practices to ensure your workloads run at maximum speed.

Syllabus

  • Course overview
    • This module offers an overview of the course and outlines the learning objectives.
  • Introduction
    • This module details the specialized networking requirements for AI workloads compared to traditional web applications. It covers the specific bandwidth and latency demands of each pipeline stage—from ingestion to inference—and analyzes the "rail-aligned" network architectures of Google Cloud's A3 and A4 GPU machine types designed to maximize "Goodput."
  • Networking for data ingestion
    • This module details strategies for efficiently moving massive datasets into the cloud. It covers the use of the Cross-Cloud Network and Cloud Interconnect to establish high-bandwidth pipelines, and outlines configuration best practices, such as enabling Jumbo Frames (raising the MTU), to reduce protocol overhead and optimize throughput; a configuration sketch follows the syllabus.
  • Networking for AI training
    • This module details the critical role of low-latency networking in distributed model training. It covers the necessity of Remote Direct Memory Access (RDMA) for gradient synchronization, the benefits of Google's Titanium offload architecture in freeing up CPU resources, and the topology choices required to scale clusters without bottlenecks; see the gradient-synchronization sketch after the syllabus.
  • Networking for inference
    • This module details the networking challenges specific to Generative AI inference, such as bursty traffic and long-lived connections. It covers optimizing Time-to-First-Token using the GKE Inference Gateway and "Queue Depth" routing (illustrated in the sketch after the syllabus), while also addressing best practices for network reliability and Identity and Access Management (IAM).
  • Course Resources
    • Links to the student PDFs for all modules
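
To make the jumbo-frames point concrete: with 40 bytes of TCP/IPv4 headers, the default 1,460-byte VPC MTU spends roughly 2.7% of every packet on protocol overhead, while the 8,896-byte maximum cuts that to about 0.45%. Below is a minimal sketch of enabling jumbo frames on a new VPC with the google-cloud-compute Python client; the project and network names are placeholders, and the course may well demonstrate the same setting through the console or gcloud instead.

```python
from google.cloud import compute_v1

# Create a custom-mode VPC with the maximum Google Cloud MTU (8896 bytes,
# i.e. jumbo frames) to cut per-packet protocol overhead on ingestion paths.
network = compute_v1.Network(
    name="ai-ingest-vpc",           # placeholder name
    mtu=8896,                       # default is 1460; 8896 is the supported maximum
    auto_create_subnetworks=False,  # custom mode: subnets are defined explicitly
)

client = compute_v1.NetworksClient()
operation = client.insert(project="my-project", network_resource=network)  # placeholder project
operation.result()  # block until the VPC exists
```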
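
The RDMA discussion in the training module is easiest to see from the code's point of view: in data-parallel training, every step ends with an all-reduce over the gradients, so the latency of that collective sits squarely on the critical path, which is what RDMA transports and offload hardware such as Titanium are there to shorten. The PyTorch sketch below is illustrative only and is not taken from the course; NCCL is the usual backend and can use GPUDirect RDMA when the fabric supports it.

```python
import torch
import torch.distributed as dist

def sync_gradients(model: torch.nn.Module) -> None:
    """Average gradients across all workers after the backward pass.

    Each all_reduce is a cluster-wide collective, so its latency lands
    directly on the training critical path; low-latency transports such
    as RDMA keep the GPUs from idling while they wait for it.
    """
    world_size = dist.get_world_size()
    for param in model.parameters():
        if param.grad is not None:
            dist.all_reduce(param.grad, op=dist.ReduceOp.SUM)
            param.grad /= world_size  # SUM across workers, then average

# Typical setup, one process per GPU (e.g. launched with torchrun):
#   dist.init_process_group(backend="nccl")
#   ... forward pass, loss.backward(), sync_gradients(model), optimizer.step()
```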
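
Finally, the "Queue Depth" routing mentioned in the inference module is a simple policy: send each new request to the replica with the fewest requests already waiting, because queue length predicts Time-to-First-Token far better than CPU load when requests are long-lived and uneven in cost. The toy Python sketch below illustrates the policy only; it is not the GKE Inference Gateway implementation.

```python
import random
from dataclasses import dataclass

@dataclass
class Replica:
    name: str
    queue_depth: int  # requests already waiting on this replica

def route(replicas: list[Replica]) -> Replica:
    """Send the new request to the replica with the shallowest queue,
    breaking ties at random so equally idle replicas share the load."""
    shallowest = min(r.queue_depth for r in replicas)
    return random.choice([r for r in replicas if r.queue_depth == shallowest])

# Example: the next request lands on replica "b".
pool = [Replica("a", 7), Replica("b", 2), Replica("c", 5)]
print(route(pool).name)
```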

Taught by

Google Cloud Training
