Retrofitting OTEL Collectors and Prometheus - How To Overcome Scale and Design Limitations
CNCF [Cloud Native Computing Foundation] via YouTube
Overview
Learn how to overcome scale and design limitations when running OpenTelemetry Collectors and Prometheus in large enterprise environments through practical retrofitting. Explore real-world challenges faced at eBay, where span traffic is spread across multiple Kubernetes clusters and regions: service graph connectors require spans to be routed globally to a single collector, which demands substantial memory to hold spans for traces of varying duration. See how Prometheus's lack of native support for long-term exemplar retention creates operational constraints. Examine an approach that uses a ClickHouse-based internal trace store to feed spans to service graph connectors sustainably for metrics generation, and a ClickHouse-based solution for long-term exemplar retention. Gain insight into practical workarounds that avoid two common pitfalls, extensive customization or abandoning the project altogether, when standard OTEL and Prometheus configurations do not fit an organization's design requirements or scaling needs.
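To illustrate why a service graph needs the spans of a trace in one place, here is a minimal Python sketch. It is not the connector's actual implementation and not the speakers' code: the `Span` class, its field names, and `service_graph_edges` are hypothetical, and the pairing rule (a server span matched to the client span it names as its parent within the same trace) is a simplification.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Span:
    """Hypothetical, simplified span record (not the OTLP data model)."""
    trace_id: str
    span_id: str
    parent_span_id: Optional[str]
    service: str
    kind: str  # "client" or "server"


def service_graph_edges(spans: Iterable[Span]) -> Counter:
    """Count caller -> callee edges by pairing client and server spans."""
    spans = list(spans)
    # Index client spans so each server span can look up its caller.
    clients = {(s.trace_id, s.span_id): s for s in spans if s.kind == "client"}
    edges: Counter = Counter()
    for s in spans:
        if s.kind != "server" or s.parent_span_id is None:
            continue
        caller = clients.get((s.trace_id, s.parent_span_id))
        if caller is not None:
            # Both halves of the call are visible: record one edge.
            edges[(caller.service, s.service)] += 1
    return edges


# Two halves of one call, seen by collectors in different regions.
region_a = [Span("t1", "a1", None, "checkout", "client")]
region_b = [Span("t1", "b1", "a1", "payments", "server")]

separate = service_graph_edges(region_a) + service_graph_edges(region_b)
combined = service_graph_edges(region_a + region_b)
```

In this sketch `separate` is empty while `combined` contains the `("checkout", "payments")` edge, which is the reason both halves of a call have to reach the same place, and why unpaired spans must be held in memory until their counterpart arrives.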
Syllabus
Retrofitting OTEL Collectors & Prometheus - How To Overcome Scale... Vijay Samuel & Sandeep Raveesh
Taught by
CNCF [Cloud Native Computing Foundation]