Learn about a novel runtime approach for detecting protocol implementation bugs in distributed systems through this 16-minute conference presentation from NSDI '25. Discover how Runtime Protocol Refinement Checking (RPRC) addresses safety bugs that are introduced when programmers translate a protocol description into code in distributed protocol implementations (DPIs) such as Chubby or Etcd. Explore the Ellsberg system, which observes the runtime behavior of a deployed DPI and alerts operators to protocol implementation bugs without making assumptions about implementation details or adding coordination overhead. Understand how the system relies on the principle that a protocol's safety properties hold when all live processes correctly implement the protocol, and how it checks this by comparing the messages of actual DPI processes against a simulated execution of the protocol. Examine case studies on three open-source distributed systems (Etcd, Zookeeper, and Redis Raft) that demonstrate detection of previously reported protocol bugs in production environments.
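The message-comparison idea described above can be illustrated with a minimal Python sketch. It is not Ellsberg's code or API: the toy "one vote per term" protocol and the names `Message`, `VoteSpec`, and `check_trace` are all hypothetical, chosen only to show a checker that replays a process's observed incoming messages through a simulated protocol and flags any step where the process's actual outgoing messages differ from what the protocol would send.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Message:
    # Hypothetical message shape for a toy voting protocol.
    kind: str          # "RequestVote" or "Vote"
    term: int
    candidate: str
    granted: bool = False


@dataclass
class VoteSpec:
    """Simulated protocol (assumed for illustration): grant at most one vote per term."""
    voted_for: dict = field(default_factory=dict)  # term -> candidate

    def step(self, incoming: Message) -> list:
        """Return the messages the protocol says should be sent in response."""
        if incoming.kind != "RequestVote":
            return []
        # The first candidate seen in a term holds the vote for that term.
        holder = self.voted_for.setdefault(incoming.term, incoming.candidate)
        granted = holder == incoming.candidate
        return [Message("Vote", incoming.term, incoming.candidate, granted)]


def check_trace(trace):
    """Compare observed behavior with the simulated protocol.

    trace is a list of (incoming_message, observed_outgoing_messages) pairs
    recorded for one process. Returns the indices of steps where the
    observed messages differ from the simulated protocol's messages.
    """
    spec = VoteSpec()
    violations = []
    for i, (incoming, observed) in enumerate(trace):
        expected = spec.step(incoming)
        if expected != observed:
            violations.append(i)
    return violations


# A buggy process grants a second vote in term 1.
buggy_trace = [
    (Message("RequestVote", 1, "A"), [Message("Vote", 1, "A", True)]),
    (Message("RequestVote", 1, "B"), [Message("Vote", 1, "B", True)]),
]
print(check_trace(buggy_trace))
```

The sketch checks only observable messages, so it needs no knowledge of how the process stores its state, which mirrors the blurb's point about avoiding assumptions on implementation details. How the real system collects messages and handles nondeterminism is not covered here.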