

Deriving Semantic Checkers from Tests to Detect Silent Failures in Production Distributed Systems

USENIX via YouTube

Overview

This 18-minute conference talk from OSDI '25 shows how to automatically derive semantic checkers from system tests to detect silent failures in production distributed systems. Silent failures violate a system's semantics without raising explicit errors, and writing checkers for them by hand traditionally requires extensive domain knowledge and manual effort. The talk presents findings from a large-scale study of existing system test cases and explains how the T2C framework uses static and dynamic analysis to transform and generalize tests into runtime checkers. Applied to four major distributed systems, T2C derived tens to hundreds of checkers that detected 15 out of 20 real-world silent failures while maintaining low runtime overhead. The approach reuses existing test infrastructure as a source of runtime monitoring, with the aim of improving the reliability and correctness of production distributed systems.
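
To make the idea of generalizing a test into a runtime checker more concrete, here is a small hand-written Python sketch. It is not the T2C implementation, which derives checkers automatically through static and dynamic analysis. It only illustrates what the result of such a transformation might look like on a toy key-value store. Every name in it (`KVStore`, `CheckedStore`, `check_put_then_get`, the `drop_prefix` bug) is hypothetical and invented for this example.

```python
from typing import Dict, List, Optional


class KVStore:
    """Toy in-memory store (hypothetical) with an optional injected silent bug."""

    def __init__(self, drop_prefix: Optional[str] = None) -> None:
        self._data: Dict[str, str] = {}
        # Assumption for illustration: writes to keys with this prefix are lost.
        self._drop_prefix = drop_prefix

    def put(self, key: str, value: str) -> None:
        if self._drop_prefix is not None and key.startswith(self._drop_prefix):
            return  # silent failure: no exception, no error message
        self._data[key] = value

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)


def test_put_then_get() -> None:
    """An ordinary test: the assertion is tied to hard-coded inputs."""
    store = KVStore()
    store.put("a", "1")
    assert store.get("a") == "1"


def check_put_then_get(store: KVStore, key: str, value: str) -> bool:
    """Generalized form of the test's assertion: inputs become parameters."""
    return store.get(key) == value


class CheckedStore:
    """Wraps a store and evaluates the checker after every put at runtime."""

    def __init__(self, store: KVStore) -> None:
        self._store = store
        self.violations: List[str] = []  # keys whose write was not observable

    def put(self, key: str, value: str) -> None:
        self._store.put(key, value)
        if not check_put_then_get(self._store, key, value):
            self.violations.append(key)

    def get(self, key: str) -> Optional[str]:
        return self._store.get(key)


if __name__ == "__main__":
    test_put_then_get()  # the original test passes on a healthy store
    checked = CheckedStore(KVStore(drop_prefix="tmp/"))
    checked.put("user/1", "alice")
    checked.put("tmp/9", "scratch")  # dropped without any explicit error
    print(checked.violations)
```

The original test only exercises the key "a", so it never sees the dropped write. The parameterized checker runs against whatever keys the live workload uses and records the violation even though `put` raised nothing.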

Syllabus

OSDI '25 - Deriving Semantic Checkers from Tests to Detect Silent Failures in Production...

Taught by

USENIX

