Data Engineering is the backbone of modern data-driven companies. To excel, you need hands-on experience with the tools and processes that power data pipelines in real-world environments. This course gives you practical, project-based learning with the following tools: PostgreSQL, Python, Docker, Apache Airflow, Postman, SODA, and GitHub Actions. I will guide you through using each of them.
What you will learn in the course:
Python for Data Engineering: Build Python scripts that extract data from APIs (prototyped with Postman), load it into the data warehouse, and transform it there (ELT).
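To make the extract step concrete, here is a minimal sketch of pulling JSON from an API and flattening it into warehouse-ready rows. The endpoint URL and the field names (`id`, `customer`, `amount`) are illustrative placeholders, not the course's actual API:

```python
"""Sketch of the E in ELT: fetch JSON from an API, flatten to rows.
The URL and record fields below are hypothetical examples."""
import json
from urllib.request import urlopen

EXAMPLE_URL = "https://api.example.com/v1/orders"  # placeholder endpoint


def fetch_payload(url: str) -> dict:
    """Fetch raw JSON from the API (the request you'd first prototype in Postman)."""
    with urlopen(url) as resp:
        return json.load(resp)


def to_rows(payload: dict) -> list[tuple]:
    """Flatten API records into tuples ready to load into the warehouse."""
    return [
        (rec["id"], rec["customer"], float(rec["amount"]))
        for rec in payload.get("results", [])
    ]


if __name__ == "__main__":
    rows = to_rows(fetch_payload(EXAMPLE_URL))
    print(f"extracted {len(rows)} rows")
```

Keeping the fetch and the flattening as separate functions makes the transform logic easy to unit-test without hitting the network.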
SQL for Data Pipelines: Use PostgreSQL as a data warehouse and interact with it through both psql and DBeaver.
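The "T" in ELT happens inside the warehouse with SQL. As a runnable sketch, the snippet below uses SQLite purely as a stand-in for PostgreSQL, so it runs anywhere with no server; in the course setup you would run the same kind of statement against Postgres via psql or DBeaver. Table and column names are illustrative:

```python
# Sketch of an in-warehouse SQL transform (SQLite standing in for PostgreSQL).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_orders (id INTEGER, customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?, ?)",
    [(1, "alice", 9.5), (2, "alice", 20.0), (3, "bob", 5.0)],
)

# Transform the raw landed rows into an analytics-friendly summary table.
conn.execute(
    """CREATE TABLE customer_totals AS
       SELECT customer, SUM(amount) AS total
       FROM raw_orders
       GROUP BY customer"""
)
totals = dict(conn.execute("SELECT customer, total FROM customer_totals"))
```

The same `CREATE TABLE ... AS SELECT` pattern works unchanged in PostgreSQL.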
Docker for Containerized Deployments: Discover how to containerize data applications with Docker, making your data pipelines portable and easy to scale.
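A containerized pipeline typically boils down to a short Dockerfile. This is a minimal sketch, assuming a `requirements.txt` and an entry-point script named `pipeline.py`; both file names are placeholders, not the course's actual project layout:

```dockerfile
# Minimal sketch of containerizing a Python pipeline (file names are placeholders).
FROM python:3.11-slim
WORKDIR /app

# Install dependencies first so this layer is cached across code changes.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the pipeline code and run it.
COPY . .
CMD ["python", "pipeline.py"]
```

Copying `requirements.txt` before the rest of the code is a common layer-caching trick: dependency installation is only re-run when the requirements change.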
Airflow for Workflow Automation: Master the basics of orchestrating and automating your data workflows with Apache Airflow, a must-have tool in data engineering.
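An Airflow workflow is defined as a DAG in Python. Below is a minimal sketch wiring three placeholder tasks into an extract → load → transform chain; the task bodies, DAG id, and schedule are illustrative, and the file assumes `apache-airflow` (2.x) is installed:

```python
# Minimal sketch of an Airflow DAG chaining extract -> load -> transform.
# Task callables and schedule are placeholders; requires apache-airflow.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    ...  # pull data from the API


def load():
    ...  # land raw data in the warehouse


def transform():
    ...  # run in-warehouse SQL transforms


with DAG(
    dag_id="elt_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_load = PythonOperator(task_id="load", python_callable=load)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)

    # The >> operator declares task ordering.
    t_extract >> t_load >> t_transform
```

Dropping a file like this into Airflow's `dags/` folder is enough for the scheduler to pick it up.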
Testing and Data Quality Assurance: Understand how to perform unit, integration, and end-to-end (E2E) tests using pytest together with Airflow's DAG tests to validate your data pipelines. Implement data quality tests with SODA to ensure your data meets business and technical requirements.
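At the unit level, pytest tests are just plain functions with assertions. Here is a minimal sketch testing a small transform helper; the helper itself (`clean_amount`) is an illustrative example, not the course's actual code:

```python
# Sketch of a pytest-style unit test for a transform helper.
# The function under test is a hypothetical example.


def clean_amount(value: str) -> float:
    """Normalize a raw amount string like '$1,250.00' to a float."""
    return float(value.replace("$", "").replace(",", ""))


def test_clean_amount():
    assert clean_amount("$1,250.00") == 1250.0
    assert clean_amount("9.5") == 9.5
```

Saved as e.g. `test_transforms.py`, pytest discovers and runs any `test_*` function automatically; integration and E2E tests build on the same pattern with real (or containerized) services behind them.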
CI/CD for Automated Testing & Deployment: Learn to automate testing and deployment pipelines with GitHub Actions for smooth continuous integration and delivery.
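A GitHub Actions pipeline is a YAML workflow checked into the repository. This is a minimal sketch that runs the test suite on every push and pull request; the workflow name, Python version, and file paths are placeholders:

```yaml
# Minimal sketch of a CI workflow (.github/workflows/ci.yml); values are placeholders.
name: ci
on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install -r requirements.txt
      - run: pytest
```

A deployment job would follow the same shape, typically gated to run only on the main branch after the tests pass.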