Overview
Learn about Skybridge, an out-of-band replication system designed to provide bounded staleness guarantees for distributed caches, in this 16-minute conference talk from OSDI '25. Discover how Meta Platforms Inc. addresses the consistency challenges of globally distributed systems that rely on asynchronous replication to maintain high availability and low latency across multiple geographic locations. Explore how eventual consistency causes problems in Meta's services that range from minor annoyances to product-breaking bugs, and understand the research question: can meaningful bounds be placed on write visibility while preserving scalability? Examine Skybridge's architecture as a complement to the main replication pipeline: it leverages the existing reliable delivery streams while focusing on real-time delivery of updates. Analyze the performance results, which show Skybridge achieving 2-second bounded staleness for 99.99998% of writes, compared with 99.993% for the main pipeline, while maintaining a lightweight footprint of only 0.54% of the size of the cache deployments. Gain insights into how the system avoids correlated failures and offers a practical way to improve consistency guarantees in large-scale distributed caches without sacrificing the benefits of eventual consistency.
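To put the two percentages in perspective, the short Python sketch below converts them into the fraction of writes that miss the 2-second bound. It takes the figures at face value as quoted in the description above; the function name `violation_rate` and the per-million framing are illustrative choices, not something taken from the talk.

```python
from fractions import Fraction

def violation_rate(within_bound_percent: str) -> Fraction:
    """Fraction of writes that miss the bound, given the percentage that meet it."""
    return 1 - Fraction(within_bound_percent) / 100

# Percentages as quoted in the talk description (taken at face value).
main_pipeline = violation_rate("99.993")
skybridge = violation_rate("99.99998")

# Express each rate as late writes per one million writes.
main_per_million = main_pipeline * 1_000_000
skybridge_per_million = skybridge * 1_000_000

# How many times fewer writes miss the bound with Skybridge.
reduction = main_pipeline / skybridge

print(f"main pipeline: {float(main_per_million)} late writes per million")
print(f"Skybridge:     {float(skybridge_per_million)} late writes per million")
print(f"reduction:     {float(reduction)}x")
```

Under these assumptions, roughly 70 writes per million miss the bound on the main pipeline versus 0.2 per million with Skybridge, a reduction of about 350 times.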
Syllabus
OSDI '25 - Skybridge: Bounded Staleness for Distributed Caches
Taught by
USENIX