Overview
Learn how to prevent workflow failures before they occur using gradient boosting in this 16-minute conference talk from Conf42 ML 2026. The talk opens with the problem of long-running workflows that fail late in the process and explains why Prefect is a good place to detect early warning signals. It defines failure as an SLA breach, discusses what can be done about it, and walks through a case study of a portfolio risk flow and its early feature signals. The modeling section covers CatBoost, baselines, and evaluation. SHAP drivers are then used to explain the predicted risk in terms developers can act on. A live demo in the Prefect UI compares a low-risk run with a high-risk run. The talk goes on to cover the move from demo to production, including code changes, safe rollout, and drift monitoring, and closes with ideas beyond SLAs such as rerouting and smarter retries.
Syllabus
The Pain: Long Workflows That Fail Late
Why Prefect Is the Perfect Place for Early Signals
Defining Failure as an SLA Breach and What We Can Do About It
Case Study: Portfolio Risk Flow & Early Feature Signals
Modeling Approach: CatBoost, Baselines, and Evaluation
Explaining Risk: SHAP Drivers Developers Can Act On
Live Demo in Prefect UI: Low-Risk Run vs High-Risk Run
From Demo to Production: Code Changes, Safe Rollout, and Drift Monitoring
Beyond SLAs: Rerouting, Smarter Retries, and the Big Takeaway
Wrap-Up and Thanks
Taught by
Conf42