Overview
Explore Apache Iceberg's metadata capabilities in this 38-minute conference talk from ApacheCon 2022. The talk covers the "secret sauce" of Iceberg's rich metadata, which enables core features like time travel, query optimizations, and optimistic concurrency handling. Learn how to access and use system tables to gain insight into your Iceberg data, with real-life queries for identifying recently updated partitions, investigating small file issues, and understanding data file filtering. Advanced use cases include data auditing and quality assessment, such as tracking null value additions and data ingest latency. The talk closes with practical tips for optimizing metadata table performance and an update on ongoing community improvements. It is suitable both for experienced Iceberg users and for those just getting started with this under-utilized feature.
Syllabus
Intro
What is Iceberg
Metadata files
Metadata tables
Partitions table
The newest table
Why are there so many tables
Partitions
Snapshots
Maintenance Operations
Expired Snapshots
Snapshots Summary
Optimize Metadata
Optimize Iceberg Data
Bonus
Data Quality
Puffin Files
Avro
Taught by
The ASF