Retrofitting OTEL Collectors and Prometheus - How To Overcome Scale and Design Limitations
CNCF [Cloud Native Computing Foundation] via YouTube
Overview
Learn how to work around scale and design limitations when running OpenTelemetry Collectors and Prometheus in large enterprise environments by retrofitting them rather than replacing them. The talk covers challenges encountered at eBay, where span traffic is spread across multiple Kubernetes clusters and regions: the service graph connector requires spans to be routed globally to a single collector, and that collector needs substantial memory to hold spans for traces of varying duration. Prometheus, for its part, has no native support for long-term retention of exemplars, which creates operational constraints. The speakers describe using an internal ClickHouse-based trace store to feed spans to the service graph connector sustainably for metrics generation, and using ClickHouse for long-term exemplar retention. The aim is to show practical workarounds that avoid the two common outcomes when standard OTEL and Prometheus configurations don't fit an organization's design requirements or scaling needs: extensive customization or abandoning the project altogether.
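To make the service graph idea concrete, here is a minimal Python sketch of how caller-to-callee edges might be derived from a batch of spans read back from a trace store. It is an illustration only, not the speakers' implementation or the connector's actual code: the `Span` fields and the `service_graph_edges` function are hypothetical names, and the pairing rule (a server span whose parent is a client span in the same trace) is a simplifying assumption.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Span:
    # Hypothetical, simplified span record; real spans carry many more fields.
    trace_id: str
    span_id: str
    parent_span_id: Optional[str]
    service: str
    kind: str  # "client" or "server"


def service_graph_edges(spans: Iterable[Span]) -> Counter:
    """Count (caller, callee) edges by pairing each server span with its client parent."""
    # Index spans by (trace_id, span_id) so parents can be looked up directly.
    by_id = {(s.trace_id, s.span_id): s for s in spans}
    edges: Counter = Counter()
    for span in by_id.values():
        if span.kind != "server" or span.parent_span_id is None:
            continue
        parent = by_id.get((span.trace_id, span.parent_span_id))
        # An edge is only recorded when both halves of the call are present.
        if parent is not None and parent.kind == "client":
            edges[(parent.service, span.service)] += 1
    return edges
```

Because an edge is counted only when both the client and the server span are in the same batch, the sketch also shows why the two halves of a call have to end up in one place, whether that is a single collector's memory or a query against a shared store.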
Syllabus
Retrofitting OTEL Collectors & Prometheus - How To Overcome Scale... Vijay Samuel & Sandeep Raveesh
Taught by
CNCF [Cloud Native Computing Foundation]