Demystify Kubernetes networking, exploring CNI, kube-proxy, VPCs, and service meshes. Gain insights into eBPF usage and learn to debug complex networking issues in cloud-native environments.
Explore strategies for effective SRE team structuring, focusing on agency and accountability in microservices environments. Learn to balance philosophical and pragmatic approaches for optimal performance.
Explore best practices for ensuring reliable data in large ML models, focusing on challenges in LLMs and applying SRE principles to manage data risks in production systems.
Explore Prometheus's native histograms for improved SRE tooling. Learn practical usage, SLO tracking, and troubleshooting techniques to enhance your monitoring capabilities.
Explore continuous profiling techniques to optimize resource usage, reduce cloud costs, and enhance software performance across multiple languages in cloud-native environments.
Explore hosting GPT models in Kubernetes, covering GPU sharding, tensor parallelism, and model optimization. Learn trade-offs between latency, accuracy, and resource allocation, with a live demo showcasing performance.
Explore Datadog's response to a massive global outage, including technical challenges, solutions, and incident management strategies for coordinating 500+ engineers over 2+ days.
Explore the architecture, construction, and operation of Meta's 5-exaflop AI Research SuperCluster, supporting large-scale model training with insights on monitoring and observability for cutting-edge AI development.
Explore the importance of noticing the ordinary in diverse fields like government, astronomy, and SRE. Learn to identify automation opportunities and balance human intervention in predictable scenarios.
Explore a critical incident where a single configuration change caused widespread website outages, examining the recovery process, lessons learned, and subsequent improvements in Kubernetes infrastructure.
Practical guide for ML monitoring: essential metrics, prioritization, and tool selection. Insights from 30+ ML use cases to detect issues in real time and overcome traditional software monitoring limitations.
Explore a complex Postgres cluster outage, learn to reproduce failures locally, and gain insights into debugging complex systems and understanding database internals.
Discover strategies to optimize Node.js deployments, prevent resource exhaustion, and improve system efficiency. Gain mathematical insights and practical solutions for scaling Node.js applications effectively.
Discover how Grafana Labs scaled their memcached 42x, improving reliability and cost-effectiveness for large-scale data querying using cloud-based local SSDs and innovative caching strategies.
Explore QUIC and HTTP/3 deployment challenges, debugging techniques, and real-world case studies. Learn about encryption, load balancing, and troubleshooting tools for modern networking protocols.