Overview
Explore the challenges and solutions for running stateful services on DC/OS in this 27-minute conference talk by Nathan Shimek from New Context. Learn about overcoming obstacles such as volume pinning, dynamic provisioning limitations, and fixed resource requirements when deploying mission-critical applications on DC/OS. Gain insights into real-world projects conducted with major container users, addressing key issues like failure domains, production-quality design, and operator safety. Discover strategies for field failure testing, preventing cascading failures, and managing multidisciplinary infrastructure. Delve into crucial aspects of cluster management, platform security, maintenance, and externalizing services. Understand the importance of organizational guardrails and training in successful DC/OS deployments.
Syllabus
Introduction
Challenges
Agenda
Failure Domains
Design for Production Quality
Are Operators Safe?
Field Failure Testing
Cascading Failure
Infrastructure is Multidisciplinary
Cluster Management
Platform Security
Maintenance
Externalizing Services
Organization
Guardrails
Training
Taught by
Linux Foundation