Overview
Explore advanced streaming data practices with Apache Iceberg in this conference talk, which addresses the challenges that real-time data processing adds beyond traditional batch operations. Discover practical solutions for managing small file creation, optimizing partitioning and sorting strategies, and fine-tuning write configurations to build reliable streaming pipelines. Learn how to handle high-throughput compaction cost-effectively, manage late-arriving data, and work with large numbers of manifests while keeping query planning performance consistent. The talk also covers aggressive snapshot expiration strategies, Storage Partitioned Join (SPJ) in Spark for efficient merge operations, and orchestration of maintenance jobs to avoid commit conflicts in streaming environments. Real-world examples and practical demonstrations show how to scale an existing streaming platform or build a new one on Iceberg, with a focus on performance optimization, cost management, and architectural best practices.
Syllabus
Streaming with Iceberg: From Zero to Hero
Taught by
StreamNative