Scaling Thanos and Prometheus for Massive Metrics Deployment at Reddit
CNCF [Cloud Native Computing Foundation] via YouTube
Overview
Explore how Reddit scales its monitoring infrastructure using Thanos and Prometheus in this conference talk. Discover the custom monitoring operator Reddit developed to manage thousands of Prometheus instances, which together handle over 45 million samples per second and 600 million active series. Learn about the Kubernetes controller used to orchestrate this deployment and how Thanos enables long-term storage and global querying. Gain insights into the tools Reddit's team built, the challenges they faced, and the solutions they implemented to achieve a robust, scalable metrics system for one of the world's largest social media platforms.
Syllabus
Scaling Thanos at Reddit - Ben Kochie & Trevor Riles, Reddit
Taught by
CNCF [Cloud Native Computing Foundation]