Real-Time Mode Technical Deep Dive - How We Built Sub-300 Millisecond Streaming Into Apache Spark
Databricks via YouTube
Overview
Explore the technical architecture behind Apache Spark's new real-time execution mode that achieves sub-300 millisecond p99 latencies for streaming queries in this 32-minute conference talk. Dive deep into the core architectural innovations including concurrent stage scheduling and non-blocking shuffle mechanisms that enable dramatic latency improvements while maintaining Spark's fault-tolerance guarantees. Learn about specific optimizations made to streaming SQL operators and discover how these enhancements deliver up to 10x lower latencies for real-time enrichment pipelines and feature engineering workloads in machine learning applications. Gain insights from Databricks engineers Jerry Peng and Siying Dong as they detail the technical challenges and solutions that make previously impossible low-latency use cases achievable in Apache Spark Structured Streaming.
Syllabus
Real-Time Mode Technical Deep Dive: How We Built Sub-300 Millisecond Streaming Into Apache Spark™
Taught by
Databricks