Overview
Watch this 36-minute conference talk from GOTO Chicago 2024 on integrating Apache Kafka with Apache Iceberg for modern data lakes. Learn how Kafka serves as a platform for capturing real-time streaming data, and discover the advantages of the Apache Iceberg table format for data lakes and warehouses, including schema evolution, ACID transactions, hidden partitioning, and time travel. Explore why streaming data from Kafka into Iceberg-based data lakes is complicated and how Confluent Tableflow can streamline the process. A practical demonstration shows how to implement Kafka-Iceberg integration for real-time analytics. The presentation covers the fundamental concepts of both technologies, addresses common challenges in streaming data to data lakes, and presents ways to manage that data efficiently.
Syllabus
00:00 Intro
01:10 Overview
02:06 Kafka is the standard for operational data
03:41 Iceberg for analytical data in data lakes
04:42 Apache Iceberg
05:27 Why Iceberg?
12:24 Structure of an Iceberg table
16:40 Streaming to data lakes is complicated
20:47 Tableflow materializes Kafka topics as Iceberg tables
23:47 Demo
35:37 Outro
Taught by
GOTO Conferences