Overview
Explore how to scale observability for AI workloads, which generate massive volumes of telemetry through spiky inference patterns, large GPU fleets, and complex orchestration pipelines, in this AWS re:Invent 2025 conference talk. Discover why traditional cloud-native observability approaches often fail under these demands, and learn practical strategies for avoiding costly trade-offs among data fidelity, system performance, and operational cost. Examine how observability requirements differ across segments of the AI ecosystem: GPU providers, large language model builders, AI-native platforms, and organizations adopting AI features. Learn from real-world examples and best practices for keeping systems reliable, controlling observability spend, and maintaining visibility across diverse AI infrastructure environments.
Syllabus
AWS re:Invent 2025 - Scaling Observability for the AI Era: From GPUs to LLMs (AIM121)
Taught by
AWS Events