Automate your data pipelines with Apache Airflow. This hands-on course starts with the basics and builds up to real-world orchestration, covering retries, Spark jobs, and external data ingestion.
- How to build and schedule workflows with Apache Airflow
- Core concepts of DAGs, tasks, operators, and scheduling
- Error handling, retries, and making workflows fault-tolerant
- Ensuring idempotency and robustness in data pipelines
- Using sensors to wait for external systems or events
- Orchestrating Apache Spark jobs within Airflow
- Connecting Airflow with external data sources
- Automating data ingestion into a data lake
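To give a feel for the topics above, here is a minimal sketch of an Airflow DAG that combines scheduling, retries, a sensor, and an idempotent load step. The `dag_id`, file paths, and callables are hypothetical, and the `schedule` argument assumes Airflow 2.4 or later (earlier versions use `schedule_interval`); treat this as an illustration, not course material.

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.sensors.filesystem import FileSensor

# Retries and a delay between attempts make transient failures recoverable.
default_args = {
    "retries": 3,
    "retry_delay": timedelta(minutes=5),
}

def load_to_lake(ds: str, **context) -> None:
    # Writing to a path keyed by the logical date (ds) keeps the task
    # idempotent: re-running the same interval overwrites the same partition
    # instead of duplicating data. Path is a placeholder.
    target = f"/data/lake/events/ds={ds}/"
    print(f"Loading partition {target}")

with DAG(
    dag_id="daily_ingest",          # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",              # run once per day
    catchup=False,                  # skip backfilling past intervals
    default_args=default_args,
) as dag:
    # Sensor: wait until an external system has dropped the input file.
    wait_for_file = FileSensor(
        task_id="wait_for_file",
        filepath="/data/incoming/events.csv",  # placeholder path
        poke_interval=60,
    )

    load = PythonOperator(
        task_id="load_to_lake",
        python_callable=load_to_lake,
    )

    # Task dependency: load only after the file has arrived.
    wait_for_file >> load
```

A Spark step would slot in the same way, for example via `SparkSubmitOperator` from the Spark provider package, chained after the sensor.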