Overview
Listen to a 32-minute podcast interview with Tanya Reilly, Principal Engineer at Squarespace and former member of Google's SRE staff, in which she draws parallels between the evolution of New York City's fire code and modern software reliability practices. The conversation covers core Site Reliability Engineering (SRE) principles, including the "You Build It, You Run It" philosophy, strategies for building reliable systems, and the importance of considering dependencies and instrumentation from the initial design. It applies lessons from fire safety regulations to software development, such as fireproof walls, software inspections, and circuit breakers, and it discusses error budgets, Service Level Objectives (SLOs), and risk management in software systems. Reilly also explains how SRE roles vary across organizations, from recommending patterns to emergency response, and shares practices for building sustainable systems that don't require constant monitoring. Other topics include reliability across the software lifecycle, prevention strategies, testing methodologies, and practical approaches to preventing disasters.
Syllabus
Tanya Reilly on Site Reliability Engineering and the Evolution of the New York City Fire Code
Taught by
InfoQ