AWS Data Pipelines and Orchestration with Airflow

via Udacity

Details

Provider: Udacity
Pricing: Paid Course
Languages: English
Certificate: Certificate Available
Effort: 9 hours
Sessions: Self-Paced
Level: Intermediate

Overview

Master Apache Airflow to build, schedule, and monitor data pipelines on AWS. Start with Airflow fundamentals—DAGs, tasks, XCom, Jinja templating, branching, and runtime configuration—then apply production patterns like single-responsibility design, data intervals, asset-driven scheduling, and dynamic task mapping. Build complete ETL and ELT pipelines that move data through S3 into Amazon Redshift using SQL operators, template inheritance, and data constraint checks. Then construct a modern data lakehouse using S3, Glue, Iceberg, and Athena, automating ingestion and promotion through bronze, silver, and gold layers while handling schema evolution. Deploy pipelines to Amazon MWAA and apply monitoring and observability best practices for production environments.

Syllabus

Introduction to Data Pipelines and Airflow

Discover how Apache Airflow orchestrates data pipelines as code. Author DAGs and tasks, pass data with XCom, configure runtime parameters, apply Jinja templating, and build branching dependencies.
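
As a rough illustration of the DAG idea covered in this module, the sketch below orders three tasks by their dependencies and hands each task the return values of its upstream tasks, loosely similar in spirit to how XCom passes small values between tasks. It uses only the Python standard library and is not Airflow's API; the task names, `TASKS`, `DEPENDENCIES`, and `run_pipeline` are hypothetical and are not taken from the course.

```python
from graphlib import TopologicalSorter

# Hypothetical tasks: each receives a dict of upstream results keyed by task name.
def extract(inputs):
    return [3, 1, 2]

def transform(inputs):
    return sorted(inputs["extract"])

def load(inputs):
    return f"loaded {len(inputs['transform'])} rows"

TASKS = {"extract": extract, "transform": transform, "load": load}

# Each task maps to the set of tasks it depends on (its upstream tasks).
DEPENDENCIES = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
}

def run_pipeline(tasks, dependencies):
    """Run tasks in dependency order, passing upstream return values along."""
    results = {}
    order = list(TopologicalSorter(dependencies).static_order())
    for name in order:
        upstream = {dep: results[dep] for dep in dependencies[name]}
        results[name] = tasks[name](upstream)
    return order, results

order, results = run_pipeline(TASKS, DEPENDENCIES)
print(order)
print(results["load"])
```

In Airflow itself the scheduler and executor do this ordering, and tasks are declared with operators or decorators rather than plain dictionaries.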

Data Lineage and Orchestration

Orchestrate production pipelines with schedules, data intervals, and catchup. Apply single-responsibility design, debug with flat-file snapshots, trigger DAGs on asset events, and map tasks dynamically.
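
To make data intervals and catchup more concrete, here is a minimal standard-library sketch, assuming a daily schedule. It lists the completed one-day intervals between a start date and the current time, then picks which of them would get a run: every interval when catchup is on, or only the most recent one when it is off. The function names `daily_intervals` and `runs_to_schedule` are hypothetical, and the logic is a simplification of what Airflow's scheduler does, not its implementation.

```python
from datetime import datetime, timedelta

def daily_intervals(start, now):
    """Return (interval_start, interval_end) pairs for each completed day."""
    intervals = []
    interval_start = start
    # An interval counts only once its end has passed.
    while interval_start + timedelta(days=1) <= now:
        interval_end = interval_start + timedelta(days=1)
        intervals.append((interval_start, interval_end))
        interval_start = interval_end
    return intervals

def runs_to_schedule(start, now, catchup):
    """Pick the intervals that get a run under the simplified catchup rule."""
    intervals = daily_intervals(start, now)
    # With catchup off, only the latest completed interval is run.
    return intervals if catchup else intervals[-1:]

start = datetime(2024, 1, 1)
now = datetime(2024, 1, 4, 12)

backfilled = runs_to_schedule(start, now, catchup=True)
latest_only = runs_to_schedule(start, now, catchup=False)
print(len(backfilled), len(latest_only))
```

Each pair marks the slice of data a run is responsible for, which is why a run typically starts only after its interval has ended.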

Orchestrating Warehouse Workflows with Amazon Redshift

Build end-to-end ETL and ELT pipelines that move data through S3 into Redshift. Use SQL operators, Jinja template inheritance, and data constraint checks, then deploy to production with Amazon MWAA.

Orchestrating Lakehouse Workflows with AWS Glue and Athena

Build a lakehouse on AWS with S3, Glue, Iceberg, and Athena. Automate ingestion, handle schema evolution with crawlers, and promote data through bronze, silver, and gold layers.

AWS Data Lakehouse Pipeline for Sparkify

Design an event-driven lakehouse with Airflow, S3, Glue, Iceberg, and Athena. Build three asset-triggered DAGs that ingest, transform, and promote data through raw, transaction, and analytics layers.