Unlocking Kubernetes Observability - Secure, Tenant-Centric Metrics for GPU Workloads
CNCF [Cloud Native Computing Foundation] via YouTube
Overview
Learn how to implement secure, tenant-centric observability for multi-tenant Kubernetes clusters running GPU workloads in this 35-minute conference talk from KubeCon + CloudNativeCon. See how Adobe's Tenant Exporter delivers curated, per-namespace metrics built on Prometheus, including ingress requests, container CPU and memory usage, GPU utilization, and resource quotas. The architecture uses Prometheus for metric collection, an NGINX proxy for load balancing, and prom-label-proxy integrated with kube-rbac-proxy for secure authentication. A live demonstration shows how to configure self-service metrics for GPU namespaces, with users selecting specific metrics via a ConfigMap while quotas keep system load in check. The talk also covers deployment strategies, quota management, and scaling metric delivery across multiple clusters, drawing on Adobe's real-world experience managing thousands of namespaces, and shares best practices for Kubernetes observability using CNCF tools to reduce operational overhead while improving system visibility and GPU workload optimization.
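To make the idea of per-namespace, user-selected metrics with a quota concrete, here is a minimal Python sketch. It is not the Tenant Exporter's code: the `Sample` type, the `select_tenant_metrics` function, the `max_series` quota parameter, and the metric names are all assumptions made for illustration. It only shows the general pattern of restricting samples to one tenant's namespace, keeping the metric names that tenant chose (as it might list them in a ConfigMap), and capping how many series are returned.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One metric sample; a hypothetical stand-in for a scraped series."""
    name: str
    namespace: str
    value: float


def select_tenant_metrics(samples, namespace, allowed_names, max_series):
    """Return the samples visible to one tenant.

    Keeps only samples from `namespace` whose metric name is in
    `allowed_names` (the tenant's selection), then truncates the result
    to `max_series` entries as a simple quota.
    """
    selected = [
        s for s in samples
        if s.namespace == namespace and s.name in allowed_names
    ]
    return selected[:max_series]


# Illustrative data only; the metric names are assumed examples.
samples = [
    Sample("container_cpu_usage_seconds_total", "team-a", 12.5),
    Sample("DCGM_FI_DEV_GPU_UTIL", "team-a", 87.0),
    Sample("container_memory_working_set_bytes", "team-a", 1024.0),
    Sample("DCGM_FI_DEV_GPU_UTIL", "team-b", 40.0),
]

visible = select_tenant_metrics(
    samples,
    namespace="team-a",
    allowed_names={"container_cpu_usage_seconds_total", "DCGM_FI_DEV_GPU_UTIL"},
    max_series=10,
)
```

In this sketch `visible` holds only the two `team-a` samples the tenant selected; the `team-b` sample is never exposed. In the architecture the talk describes, namespace isolation is enforced at the proxy layer rather than in application code like this.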
Syllabus
Unlocking Kubernetes Observability: Secure, Tenant-Cen... Bingi Narasimha Karthik & Ramkumar Nagaraj
Taught by
CNCF [Cloud Native Computing Foundation]