Architecting Resilience: Lessons from Managing 7000+ Kubernetes Clusters at Scale
CNCF [Cloud Native Computing Foundation] via YouTube
Overview
Explore the challenges and solutions in managing over 7,000 Kubernetes clusters at scale in this conference talk from KubeCon + CloudNativeCon. Members of Kakao's private Kubernetes-as-a-Service team share what they learned about architecting resilient systems after a significant data center fire. Learn about the economic and social impacts of the incident, and discover the team's approach to providing highly available Kubernetes clusters efficiently for developers. The talk covers design ideas for cluster high availability, implementation challenges, and concerns encountered while managing an infrastructure of 100,000+ nodes. Understand why resilience matters in cloud-native environments and how to apply these lessons to your own Kubernetes deployments.
Syllabus
Architecting Resilience: Lessons from Managing 7K+ Kubernetes Clusters at Scale
Taught by
CNCF [Cloud Native Computing Foundation]