Apache Kafka and Apache Iceberg: Real-Time Data Streaming into Modern Data Lakes

GOTO Conferences via YouTube

Overview

Watch this 36-minute conference talk from GOTO Chicago 2024 on integrating Apache Kafka with Apache Iceberg for modern data lakes. Learn how Kafka serves as a platform for capturing real-time streaming data, and discover the advantages of the Apache Iceberg table format for data lakes and warehouses, including schema evolution, ACID transactions, hidden partitioning, and time travel. Explore the complexities of streaming data from Kafka into Iceberg-based data lakes and see how Confluent Tableflow can streamline this process. A practical demonstration shows how to implement Kafka-Iceberg integration for real-time analytics. The presentation covers the fundamental concepts of both technologies, addresses common challenges in streaming data to data lakes, and presents approaches to efficient data management in modern architectures.

Syllabus

00:00 Intro
01:10 Overview
02:06 Kafka is the standard for operational data
03:41 Iceberg for analytical data in data lakes
04:42 Apache Iceberg
05:27 Why Iceberg?
12:24 Structure of an Iceberg table
16:40 Streaming to data lakes is complicated
20:47 Tableflow materializes Kafka topics as Iceberg tables
23:47 Demo
35:37 Outro

Taught by

GOTO Conferences
