
Coursera

Databricks Lakehouse Fundamentals

Pragmatic AI Labs via Coursera

Overview

Learn to build data pipelines on the Databricks Lakehouse Platform — from architecture concepts to hands-on Spark and Delta Lake. This beginner course starts with why the lakehouse pattern replaced separate data warehouses and data lakes, then moves directly into the Databricks workspace, where you'll configure compute, write PySpark and SQL queries, and manage data with Unity Catalog's three-level namespace.

Week by week, you'll progress from navigating the platform to transforming DataFrames with select, filter, groupBy, and joins, then to creating Delta Lake tables with ACID transactions, schema enforcement, and time travel. You'll perform real DML operations — INSERT, UPDATE, DELETE, and MERGE — and learn to schedule production pipelines using Databricks Jobs with DAG-based orchestration.

The course runs entirely on Databricks Free Edition, so there's no cloud billing. Six hands-on labs reinforce each module: explore the workspace, write notebook-based transformations, build Delta tables, and wire up an automated workflow. By the end, you'll have built a complete data engineering pipeline from raw ingestion through Delta Lake to scheduled production jobs.

Syllabus

  • Lakehouse Architecture and Workspace
    • This module introduces the lakehouse paradigm and the Databricks platform. You'll learn about the structure of lakehouse architecture, explore the Databricks workspace and its core tools, and understand how compute and storage work together.
  • Apache Spark on Databricks
    • This module covers notebooks and hands-on data manipulation using PySpark. You'll create and organize notebooks, load data from the Catalog, and write PySpark transformations to select, filter, aggregate, and join datasets.
  • Delta Lake Essentials
    • This module introduces Delta Lake, where you'll create Delta tables, perform transactional operations like updates, deletes, and merges, use time travel to query previous versions, and see how Delta Lake connects to governance and automation features.
  • Capstone
    • Build an end-to-end lakehouse data pipeline integrating every concept from the course. Starting from raw data files, you will construct a complete medallion architecture (bronze → silver → gold) with Delta Lake, implement incremental MERGE logic, and orchestrate the pipeline as a scheduled Databricks Job. Six hands-on lab notebooks guide you through the project using the course GitHub repository.

Taught by

Noah Gift

