Overview
Explore the architecture and implementation of multi-agent AI systems for Site Reliability Engineering (SRE) in this 57-minute conference talk from AWS re:Invent 2025. Discover how to build AI systems that reason across the complete production context (code, infrastructure, observability signals, and tribal knowledge) to deliver actionable answers rather than raw data. Learn technical approaches to autonomous root cause analysis that completes in minutes and enables intuitive "vibe debugging" in everyday engineering workflows. Examine strategies for abstracting complexity away from engineers while maintaining system effectiveness. Understand how enterprises are implementing Resolve AI solutions to accelerate decision-making, reduce operational toil, and address both the limitations of traditional observability tools and the challenges introduced by coding agents. Gain insights into the practical considerations for deploying multi-agent AI systems in production environments and the architectural patterns that enable effective AI-driven SRE operations.
Syllabus
AWS re:Invent 2025 - Building multi-agent AI SRE: from root cause to vibe debugging (AIM394)
Taught by
AWS Events