Overview
Explore the evolution of AWS load balancing technologies and new capabilities for optimizing AI/ML application performance in this 59-minute conference talk from AWS re:Invent 2025. The session demonstrates AWS Network Load Balancer configurations for ultra-low-latency search and real-time AI/ML inference, and covers optimization techniques for Amazon API Gateway aimed at low-concurrency large language model (LLM) workloads. It also examines real-world case studies and production deployment examples of minimizing network latency in AI/ML pipelines, along with architecture patterns for high-performance inference serving.
Syllabus
AWS re:Invent 2025 - Deep dive: The evolution of AWS load balancing and new capabilities (NET334)
Taught by
AWS Events