CNCF [Cloud Native Computing Foundation]

How We are Dealing with Metrics at Scale on GitLab.com

CNCF [Cloud Native Computing Foundation] via YouTube

Overview

This 20-minute conference talk by Andrew Newdigate explains how GitLab.com manages metrics at scale. The approach is to catalog the key metrics for each application component, then use that catalog to automatically generate Grafana dashboards, alerting rules, and SLA indicators. The talk traces the move from drowning in data during incidents to a streamlined monitoring system, and it covers multiple silos, split-brain alerts, and the use of Thanos Rule. Although aimed primarily at Prometheus users, the fundamental concepts apply to other metrics systems. Topics include key metrics, alerts, metrics catalogs, and example configurations.

Syllabus

Intro
Key Metrics
Alerts
Metrics Catalog
Simplest Approach
Multiple Silos
Split Brain Alerts
Thanos Rule
Example Configuration
Conclusion
Resources

Taught by

CNCF [Cloud Native Computing Foundation]
