IT Infrastructure Reliability Through Monitoring and Alerting

Conf42 via YouTube

Overview

Learn how to build reliable IT infrastructure through monitoring and alerting in this 17-minute conference talk from Conf42 SRE 2025. See why monitoring matters in cloud environments, which monitoring tools are popular in the industry, and which metrics to track for system reliability. Examine the real-world impact of these tools and the trade-offs and overheads that monitoring adds, including its effect on system performance. Develop alerting strategies that avoid alert fatigue, apply best practices for monitoring and alerting, and explore how AI and machine learning are changing the approach to infrastructure reliability.

Syllabus

00:00 Introduction to Monitoring and Alerting
02:01 Importance of Monitoring in Cloud Environments
03:47 Popular Monitoring Tools
05:36 Essential Metrics to Monitor
07:41 Real-World Impact of Monitoring Tools
09:19 Trade-offs and Overheads of Monitoring
11:03 Effective Alerting Strategies
12:40 Best Practices for Monitoring and Alerting
14:34 Future of Monitoring: AI and Machine Learning
15:27 Conclusion and Final Thoughts

Taught by

Conf42
