Unifying Human-Curated Data Ingestion and Real-Time Updates with Databricks DLT, Protobuf and BSR

Databricks via YouTube

Overview

Learn how to build a streaming-native data system that unifies file-based ingestion and real-time user edits using Databricks Delta Live Tables (DLT), Protobuf, and Buf Schema Registry (BSR) in this 40-minute conference talk. Discover the Red Stapler system architecture that merges different data sources into a single DLT pipeline for near real-time feedback while maintaining data quality and governance. Explore how Protobuf definitions managed in BSR enforce schema and data-quality rules while ensuring backward compatibility across system updates. Understand the implementation of SCD Type 2 tables that store all records regardless of validity, capturing complete version history and enabling immediate quarantine views for invalid data. Master the configuration-driven approach that allows easy adaptation to evolving survey definitions without production risks, while leveraging DLT Serverless and Kafka-compatible Bufstream for cost-effective scaling that reduces to zero during idle periods. Gain insights into achieving consistent validation, quick updates, and comprehensive audit trails essential for building trustworthy and flexible data pipelines in production environments.
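To make the SCD Type 2 and quarantine idea more concrete, here is a minimal sketch in plain Python. It is not the talk's implementation and does not use the DLT, Protobuf, or BSR APIs. The names `VersionRow`, `validate`, `apply_change`, `current_valid`, and `quarantine`, as well as the "score must be between 1 and 5" rule, are hypothetical and chosen only for illustration. It shows one history table that keeps every version of a record, valid or not, with a quarantine view derived from that same history.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VersionRow:
    """One SCD Type 2 row: a single version of a record."""
    key: str
    value: dict
    is_valid: bool
    start_seq: int
    end_seq: Optional[int] = None  # None marks the current version


def validate(value: dict) -> bool:
    # Hypothetical data-quality rule (an assumption, not from the talk):
    # "score" must be an integer between 1 and 5.
    score = value.get("score")
    return isinstance(score, int) and not isinstance(score, bool) and 1 <= score <= 5


def apply_change(history: list, key: str, value: dict, seq: int) -> None:
    """Close the current version of `key` (if any) and append the new one.

    Invalid records are stored too, flagged with is_valid=False,
    so the history captures every version.
    """
    for row in history:
        if row.key == key and row.end_seq is None:
            row.end_seq = seq
    history.append(VersionRow(key, value, validate(value), seq))


def current_valid(history: list) -> list:
    """Current versions that pass validation."""
    return [r for r in history if r.end_seq is None and r.is_valid]


def quarantine(history: list) -> list:
    """Current versions that fail validation (the quarantine view)."""
    return [r for r in history if r.end_seq is None and not r.is_valid]


# Both sources feed the same change function: a file-based load
# followed by a real-time user edit to the same record.
history = []
apply_change(history, "q1", {"score": 4}, seq=1)  # file ingestion
apply_change(history, "q1", {"score": 9}, seq=2)  # user edit, fails the rule
apply_change(history, "q2", {"score": 3}, seq=3)  # file ingestion
```

In this sketch the invalid edit to `q1` stays in the history and appears in `quarantine(history)`, while the earlier valid version remains available as a closed row for auditing. In a real pipeline the validation rules would presumably come from the Protobuf definitions rather than a hand-written function.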

Syllabus

Unifying Human-Curated Data Ingestion and Real-Time Updates with Databricks DLT, Protobuf and BSR

Taught by

Databricks
