
Coursera

Building Automated Data Pipelines with Spark, dbt, and Airflow

Coursera via Coursera

Overview

You'll master the art of building production-ready data pipelines that automatically process millions of records. In this hands-on course, you'll design end-to-end workflows that integrate diverse data sources—from databases and APIs to real-time streams—using industry-standard tools like Apache Spark, dbt, and Apache Airflow.

You'll learn to create robust data models that preserve historical changes, implement performance optimizations that reduce processing time by 30% or more, and build automated workflows with intelligent retry logic and monitoring alerts. By the end, you'll have created a complete data pipeline system that demonstrates the technical skills data engineering teams need most.

You'll know how to unify fragmented data sources, apply advanced transformation techniques, and ensure your pipelines run reliably at scale. This practical experience directly translates to the challenges you'll face as a data engineer, data analyst, or anyone working with large-scale data systems in modern organizations.
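The "intelligent retry logic" mentioned above is a core resilience pattern: in Airflow it is usually configured through task parameters such as `retries` and `retry_delay`. As a rough illustration only (the function names and delay values below are hypothetical, not from the course), the underlying idea can be sketched in plain Python:

```python
import time

def run_with_retries(task, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Run `task`, retrying on failure with exponential backoff.

    `sleep` is injectable so the backoff schedule can be tested without waiting.
    """
    for attempt in range(max_retries + 1):
        try:
            return task()
        except Exception:
            if attempt == max_retries:
                raise  # retries exhausted: surface the failure to the scheduler
            # Back off exponentially between attempts: 1s, 2s, 4s, ...
            sleep(base_delay * (2 ** attempt))
```

In a real Airflow deployment you would not write this loop yourself; you would declare the equivalent policy on the task and let the scheduler re-run failed task instances.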

Syllabus

  • Understanding Data Flow Diagram Fundamentals
    • You will learn the foundational concepts and tools needed to create systematic visual documentation of data pipeline architectures.
  • Creating Comprehensive Data Flow Diagrams
    • You will apply advanced techniques to create professional-quality data flow diagrams that accurately represent complex enterprise data systems and support stakeholder collaboration.
  • Modular Pipeline Development - Foundation & Core
    • You will establish the foundational understanding and core skills for creating modular data pipeline stages, focusing on the principles of separation of concerns and tool integration fundamentals.
  • Pipeline Implementation & Integration - Application & Assessment
    • You will implement complete end-to-end data pipelines by integrating modular components with industry-standard tools, culminating in a comprehensive assessment of your pipeline development capabilities.
  • Connector Configuration Foundations
    • You will establish foundational knowledge of connector architecture and complete your first database connector configuration using Airbyte.
  • Unified Data Integration Implementation
    • You will implement complete multi-source data integration by configuring streaming and API connectors, applying enterprise security patterns, and demonstrating mastery through comprehensive connector configuration.
  • SCD2 Historical Tracking Fundamentals
    • You will understand the fundamental concepts of SCD2 logic and begin applying these principles to create data models that preserve historical context in enterprise data warehouses.
  • dbt SCD2 Model Implementation
    • You will implement production-ready SCD2 models using dbt, creating automated historical tracking systems with proper change detection, validity periods, and current status management.
  • Workflow Design Principles - Foundation
    • You will understand the foundational concepts and design principles for creating robust data workflows with Apache Airflow.
  • Production Implementation - Core Application & Assessment
    • You will implement production-grade Airflow workflows with retry mechanisms, SLA monitoring, and parameterization for enterprise-ready data pipeline resilience.
  • Project: Building Automated Data Pipelines with Spark, dbt, and Airflow
    • You will integrate data engineering skills to build a complete automated data pipeline that processes diverse data sources, applies historical tracking, and orchestrates workflows. This project synthesizes mapping, transformation, integration, modeling, and automation capabilities into a production-ready data system.
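The SCD2 (Slowly Changing Dimension, type 2) modules above center on one idea: when a tracked attribute changes, close the current row's validity period and open a new current row, so history is preserved rather than overwritten. The course implements this with dbt; purely as a sketch of the logic (the record fields and dates here are illustrative assumptions, not course material), it looks like:

```python
def apply_scd2(history, incoming, today):
    """Apply SCD2 change detection.

    history:  list of dicts with keys id, value, valid_from, valid_to, is_current
    incoming: dict mapping id -> latest observed value
    Returns a new history with changed rows closed and new versions opened.
    """
    updated = []
    seen = set()
    for row in history:
        if row["is_current"] and row["id"] in incoming and incoming[row["id"]] != row["value"]:
            # Change detected: end the validity period of the old version...
            updated.append(dict(row, valid_to=today, is_current=False))
            # ...and open a new current version for the new value.
            updated.append({"id": row["id"], "value": incoming[row["id"]],
                            "valid_from": today, "valid_to": None, "is_current": True})
        else:
            updated.append(row)
        seen.add(row["id"])
    # Ids never seen before get an initial current row.
    for key, value in incoming.items():
        if key not in seen:
            updated.append({"id": key, "value": value,
                            "valid_from": today, "valid_to": None, "is_current": True})
    return updated
```

In dbt the same effect is typically achieved declaratively with snapshots, which maintain the validity-period columns and current-row flag for you.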

Taught by

Professionals from the Industry

