

Apache Iceberg: From Zero to Production Data Lakehouse

via DataCamp

Overview

Build production-ready Apache Iceberg lakehouses: model, migrate, and maintain tables at scale.


Data lakes promised cheap, flexible storage at scale, but they left engineers fighting broken queries, painful schema changes, and no reliable way to update data. Apache Iceberg fixes that by bringing database-grade reliability to the data lake. In this course, you'll build production-ready Iceberg lakehouses from the ground up: model and partition data for fast queries, migrate existing tables without downtime, evolve schemas and partitions safely with Git-like workflows, and keep everything performant through compaction, snapshot management, and smart write strategies. By the end, you'll be ready to design and operate a modern lakehouse with confidence.

Syllabus

  • Apache Iceberg Fundamentals
    • In this chapter, learn how to set up an Apache Iceberg lakehouse and model data with hidden partitioning so your queries skip the files they don't need.
  • Taking Advantage of Apache Iceberg Tables
    • In this chapter, learn how to migrate existing data into Iceberg and safely evolve schemas and partitions using Git-like workflows such as Write-Audit-Publish, branching, and tagging.
  • Operating and Optimizing Apache Iceberg
    • In this final chapter, learn how to keep production Iceberg tables fast at scale through smart write strategies, safe concurrency, and maintenance operations like compaction and snapshot expiration.

Taught by

Snowflake Northstar

