Overview
Learn about Skybridge, an out-of-band replication system that provides bounded staleness guarantees for distributed caches, in this 16-minute conference talk from OSDI '25. Discover how Meta Platforms Inc. addresses the consistency challenges of globally distributed systems that rely on asynchronous replication to keep availability high and latency low across multiple geographic locations. Explore the underlying problem: eventual consistency causes issues in Meta's services that range from minor annoyances to product-breaking bugs. Understand the research question of whether a meaningful bound can be placed on write visibility while preserving scalability. Examine Skybridge's architecture as a complement to the main replication pipeline, one that leverages existing reliable delivery streams and focuses on delivering updates in real time. Analyze the performance results: Skybridge achieves 2-second bounded staleness for 99.99998% of writes, compared with 99.993% for the main pipeline, while occupying a footprint of only 0.54% of the size of the cache deployment. Gain insight into how the system avoids correlated failures and offers a practical way to strengthen consistency guarantees in large-scale distributed caches without giving up the benefits of eventual consistency.
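To put the two percentages in perspective, the short Python sketch below converts them into the fraction of writes that miss the bound. It assumes that both figures measure the same thing (writes visible within 2 seconds); the function name `miss_fraction` and the sample size of ten million writes are illustrative choices, not taken from the talk.

```python
from fractions import Fraction

def miss_fraction(success_percent: str) -> Fraction:
    """Fraction of writes that miss the bound, given a success rate in percent.

    Fractions are used instead of floats so the tiny differences stay exact.
    """
    return (100 - Fraction(success_percent)) / 100

# Success rates quoted in the overview (assumed to use the same 2-second bound).
skybridge_miss = miss_fraction("99.99998")
main_miss = miss_fraction("99.993")

# How many times fewer writes miss the bound with Skybridge.
improvement = main_miss / skybridge_miss

# Illustrative sample size, chosen only to make the counts whole numbers.
writes = 10_000_000
print("Skybridge misses per 10M writes:", int(skybridge_miss * writes))
print("Main pipeline misses per 10M writes:", int(main_miss * writes))
print("Reduction factor:", int(improvement))
```

Under that assumption, the quoted figures correspond to roughly 2 late writes per ten million with Skybridge versus 700 for the main pipeline alone, a 350-fold reduction.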
Syllabus
OSDI '25 - Skybridge: Bounded Staleness for Distributed Caches
Taught by
USENIX