Overview
Listen to a 32-minute podcast interview with Tanya Reilly, Principal Engineer at Squarespace and a former member of Google's SRE staff, in which she draws parallels between the evolution of New York City's fire code and modern software reliability practices. The conversation covers core Site Reliability Engineering (SRE) principles, including the "You Build It, You Run It" philosophy, strategies for building reliable systems, and the importance of considering dependencies and instrumentation from the initial design. It applies lessons from fire safety regulation to software, such as fireproof walls, software inspections, and circuit breakers, and discusses error budgets, Service Level Objectives (SLOs), and risk management in software systems. It also explains how SRE roles vary across organizations, from recommending patterns to emergency response, and describes practices for building sustainable systems that don't require constant monitoring. Other topics include reliability across the software lifecycle, prevention strategies, testing, and practical approaches to disaster prevention.
Syllabus
Tanya Reilly on Site Reliability Engineering and the Evolution of the New York City Fire Code
Taught by
InfoQ