Chaos Engineering in Kubernetes - Breaking Your Product to Build Resilience
Platform Engineering via YouTube
Overview
Discover how to implement chaos engineering in Kubernetes environments in this 13-minute conference talk, which demonstrates how intentionally introducing failures builds system resilience. Learn why traditional testing strategies fail to prepare systems for the unpredictability of real-world cloud environments, and explore a practical Kubernetes-native chaos engineering framework. Examine the architecture and implementation of a custom Kubernetes operator, a cluster manager, database pods, and load generators that together form a platform for simulating failure scenarios and observing system behavior under stress. Understand how to design and execute automated chaos experiments with real-time health checks and observability built on Grafana, Loki, and Prometheus. Gain insight into the organizational impact of chaos engineering, including key lessons learned, implementation challenges, and how these practices strengthen both product reliability and team processes. Come away with practical knowledge for designing your own chaos experiments for resilient, cloud-native distributed systems.
Syllabus
Chaos engineering in Kubernetes: Breaking your product to build resilience - Kumar Shivendu
Taught by
Platform Engineering