Overview
Explore the challenges of scaling data systems reliably, and the solutions to them, in this 16-minute conference talk by Miriah Peterson at MLOps.community. The talk introduces the concept of Data Downtime and its impact on business outcomes, and explains how Data Reliability Engineering (DRE) teams adapt Site Reliability Engineering (SRE) practices to make data systems more reliable. It covers key strategies for strengthening data pipelines: minimizing Data Downtime, implementing Data Service Level Metrics, and monitoring data effectively. It also shows how to distinguish system failures from data failures and how to use metadata for preventative data engineering. Drawing on real-world scenarios, the talk offers practical knowledge for data professionals who want to build more robust and reliable data systems.
Syllabus
Scaling Data Reliably: A Journey in Growing Through Data Pain Points // Miriah Peterson // DE4AI
Taught by
MLOps.community