Overview
Explore strategies for building and maintaining resilient cloud infrastructure in this 52-minute conference talk from AWS re:Invent 2025. Learn to use AWS native services, including Amazon CloudWatch, AWS Systems Manager, and AWS CloudTrail, to implement automatic anomaly detection and preventive measures. See how to architect logging systems that support rapid incident investigation and continuous operational visibility across your infrastructure. Apply generative AI to accelerate incident analysis and to create automated response playbooks that reduce manual intervention during critical events. The talk presents implementation patterns and architectural approaches for handling infrastructure failures, security incidents, and performance degradation, with real-world examples aimed at minimizing recovery time and maintaining operational excellence at scale.
Syllabus
AWS re:Invent 2025 - Elevating application reliability (COP336)
Taught by
AWS Events