AWS Data Pipelines and Orchestration with Airflow

via Udacity

Overview

Master Apache Airflow to build, schedule, and monitor data pipelines on AWS. Start with Airflow fundamentals—DAGs, tasks, XCom, Jinja templating, branching, and runtime configuration—then apply production patterns like single-responsibility design, data intervals, asset-driven scheduling, and dynamic task mapping. Build complete ETL and ELT pipelines that move data through S3 into Amazon Redshift using SQL operators, template inheritance, and data constraint checks. Then construct a modern data lakehouse using S3, Glue, Iceberg, and Athena, automating ingestion and promotion through bronze, silver, and gold layers while handling schema evolution. Deploy pipelines to Amazon MWAA and apply monitoring and observability best practices for production environments.

Syllabus

  • Introduction to Data Pipelines and Airflow
    • Discover how Apache Airflow orchestrates data pipelines as code. Author DAGs and tasks, pass data with XCom, configure runtime parameters, apply Jinja templating, and build branching dependencies.
  • Data Lineage and Orchestration
    • Orchestrate production pipelines with schedules, data intervals, and catchup. Apply single-responsibility design, debug with flatfile snapshots, trigger DAGs on asset events, and map tasks dynamically.
  • Orchestrating Warehouse Workflows with Amazon Redshift
    • Build end-to-end ETL and ELT pipelines that move data through S3 into Redshift. Use SQL operators, Jinja template inheritance, and data constraint checks, then deploy to production with Amazon MWAA.
  • Orchestrating Lakehouse Workflows with AWS Glue and Athena
    • Build a lakehouse on AWS with S3, Glue, Iceberg, and Athena. Automate ingestion, handle schema evolution with crawlers, and promote data through bronze, silver, and gold layers.
  • AWS Data Lakehouse Pipeline for Sparkify
    • Design an event-driven lakehouse with Airflow, S3, Glue, Iceberg, and Athena. Build three asset-triggered DAGs that ingest, transform, and promote data through raw, transaction, and analytics layers.
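
The first module's idea of a pipeline as a DAG (tasks plus dependencies that decide execution order) can be illustrated without Airflow. The sketch below is not Airflow's API and is not taken from the course; it uses only Python's standard-library graphlib, and the task names are hypothetical, chosen to echo the S3-to-Redshift flow described in the syllabus.

```python
from graphlib import TopologicalSorter

# Hypothetical task names (not from the course materials).
# Each key maps to the set of tasks that must finish before it can run.
dependencies = {
    "stage_to_s3": set(),
    "load_to_redshift": {"stage_to_s3"},
    "check_constraints": {"load_to_redshift"},
}


def run_order(deps):
    """Return one valid execution order for the task graph.

    Raises graphlib.CycleError if the dependencies contain a cycle,
    which is why a pipeline must be a directed *acyclic* graph.
    """
    return list(TopologicalSorter(deps).static_order())


order = run_order(dependencies)
print(order)
```

Because this graph is a simple chain, only one order is possible: stage, then load, then check. An orchestrator such as Airflow adds scheduling, retries, and monitoring on top of this ordering idea.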

Taught by

Sean Murdock

Reviews

Rating of 5 at Udacity, based on 1 rating
