Discover Facebook's innovative load balancing techniques for handling over 1.3 billion users, improving performance, managing capacity, and enhancing reliability across global infrastructure.
Explore Shopify's journey from a single-database Rails app to a multi-datacenter setup, focusing on scalability, resiliency, and disaster recovery strategies for multi-tenant architectures.
Explore Google's Doorman system for global distributed client-side rate limiting, coordinating resource usage across multiple clients to prevent capacity overload.
Explore how restaurant operations parallel computer system reliability, drawing insights from dining experiences to enhance understanding of complex system management and fault tolerance.
Optimize high-scale production systems with Kafka and MySQL binlog for fresh, fast data distribution. Learn to balance caching tradeoffs and improve performance across thousands of services.
Learn strategies for managing metrics growth and cardinality in cloud-native environments, focusing on best practices, KPIs, and team collaboration to enhance observability and streamline remediation processes.
Explore reliability engineering beyond traditional SRE practices. Learn about concrete models, underlying mechanisms, and new strategies to enhance service reliability and tackle complex challenges.
Explore hidden challenges in incident management, including diagnostic work, coordination costs, and decision-making dilemmas, with insights for recognizing and handling them better.
Explore nine key questions for building effective infrastructure automation pipelines, focusing on modular design, intent-driven approaches, and seamless tool integration for continuous delivery.
Explore machine learning-driven automation for optimizing Kubernetes microservices and JVM settings, enhancing performance, efficiency, and cost-effectiveness in complex tech stacks.
Explore complex systems' traits and learn better approaches to incident analysis beyond linear root-cause methods. Gain insights from history, science, and philosophy to enhance understanding of resilience in modern organizations.
Explore Slack's evolution in incident management, covering strategies for handling numerous incidents, team-wide capability building, and future directions in maintaining platform reliability.
Explore techniques for managing and improving reliability in large-scale machine learning production systems, focusing on common failure modes, best practices, and practical strategies for SREs.
Explore a multi-year tech debt resolution journey, focusing on reducing database connections from 15,000 to under 100. Learn strategies for handling scaling challenges in distributed systems.
Explore System Dynamics for modeling feedback loops in distributed systems. Learn its history, tools, and applications to prevent outages and improve software architecture design.