
IBM

Data Engineering Capstone Project

IBM via Coursera

Overview

Showcase your skills in this Data Engineering project! In this course, you will apply the data engineering skills and techniques you learned in the previous courses of the IBM Data Engineering Professional Certificate. You will assume the role of a Junior Data Engineer who has recently joined an organization and is presented with a real-world use case that requires architecting and implementing a data analytics platform.

In this Capstone project, you will complete numerous hands-on labs. You will create and query data repositories using relational and NoSQL databases such as MySQL and MongoDB. You’ll also design and populate a data warehouse using PostgreSQL and IBM Db2 and write queries to perform Cube and Rollup operations. You will generate reports from the data in the data warehouse and build a dashboard using Cognos Analytics. You will also show your proficiency in Extract, Transform, and Load (ETL) processes by creating data pipelines that move data between repositories. Finally, you will perform big data analytics using Apache Spark and make predictions with a machine learning model.

This course is the final course in the IBM Data Engineering Professional Certificate. It is recommended that you complete all the previous courses in this Professional Certificate before starting this course.
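For readers unfamiliar with the Cube and Rollup operations mentioned above, the following is a minimal sketch in plain Python of the aggregates they produce. It is not taken from the course labs: the sample rows, the column meanings (country, category, amount), and the function names are all hypothetical, and in a real warehouse the same results would normally come from SQL grouping clauses rather than application code.

```python
from collections import defaultdict

# Hypothetical sales rows (assumption): (country, category, amount)
SALES = [
    ("US", "Books", 100),
    ("US", "Toys", 50),
    ("DE", "Books", 70),
]


def rollup(rows):
    """Emulate a rollup over (country, category).

    Produces totals at three levels of the hierarchy:
    (country, category), (country, None) and (None, None).
    None stands for "all values" at that level.
    """
    totals = defaultdict(int)
    for country, category, amount in rows:
        totals[(country, category)] += amount  # detail level
        totals[(country, None)] += amount      # subtotal per country
        totals[(None, None)] += amount         # grand total
    return dict(totals)


def cube(rows):
    """Emulate a cube: the rollup totals plus per-category subtotals."""
    totals = defaultdict(int, rollup(rows))
    for _country, category, amount in rows:
        totals[(None, category)] += amount     # subtotal per category
    return dict(totals)


if __name__ == "__main__":
    for key, value in sorted(cube(SALES).items(), key=repr):
        print(key, value)
```

The difference between the two is visible in the keys: a rollup follows the column order and stops at per-country subtotals, while a cube also adds subtotals for every other combination, here the per-category totals.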

Syllabus

  • Data Platform Architecture and OLTP Database
    • In this module, you will design a data platform that uses MySQL as its OLTP database and use MySQL to store the OLTP data.
  • Querying Data in NoSQL Databases
    • In this module, you will design a data platform that uses MongoDB as a NoSQL database. You will use MongoDB to store the e-commerce catalog data.
  • Build a Data Warehouse
    • In this module, you will design and implement a data warehouse and then generate reports from the data it holds.
  • Data Analytics
    • In this module, you will assume the role of a data engineer at an e-commerce company. Your company has finished setting up a data warehouse, and you are now responsible for designing a reporting dashboard that reflects the key metrics of the business.
  • ETL & Data Pipelines
    • In this module, you will perform ETL operations to move transactional data from an OLTP database (MySQL) into a data warehouse (PostgreSQL). You will implement and automate an ETL pipeline in Python that extracts daily incremental records from the production database, transforms them as needed, and loads them into the warehouse. Once the ETL process is established, you will extend it using Apache Airflow, a workflow orchestration tool. You will design DAGs (Directed Acyclic Graphs) that define task dependencies, automate the extraction and transformation of web server logs, and archive processed data for downstream analytics.
  • Big Data Analytics with Spark
    • In this module, you will use data from a web server to analyse search terms. You will then load a pretrained sales forecasting model and use it to forecast sales for a future year.
  • Final Project
    • In this module, you will make a final submission of all the labs you’ve completed throughout the course for evaluation. You can choose to have your submission evaluated by an AI tool or through a peer-graded review.
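
The incremental ETL step described in the ETL & Data Pipelines module can be sketched roughly as follows. This is an illustration under stated assumptions, not the course's lab code: it uses Python's built-in sqlite3 module as a stand-in for both MySQL and PostgreSQL so that it runs without any server, and the table name (sales), the columns (sale_id, product, amount_cents, amount), and all function names are hypothetical.

```python
import sqlite3


def last_loaded_id(warehouse):
    """Return the highest sale_id already in the warehouse, or 0 if empty."""
    row = warehouse.execute("SELECT MAX(sale_id) FROM sales").fetchone()
    return row[0] or 0


def extract_new_rows(source, after_id):
    """Extract only the source rows added since the last load."""
    return source.execute(
        "SELECT sale_id, product, amount_cents FROM sales "
        "WHERE sale_id > ? ORDER BY sale_id",
        (after_id,),
    ).fetchall()


def transform(rows):
    """Normalise product names and convert cents to currency units."""
    return [(sid, product.strip().lower(), cents / 100)
            for sid, product, cents in rows]


def load(warehouse, rows):
    """Insert the transformed rows in a single transaction."""
    with warehouse:  # commits on success, rolls back on error
        warehouse.executemany(
            "INSERT INTO sales (sale_id, product, amount) VALUES (?, ?, ?)",
            rows,
        )


def run_incremental_etl(source, warehouse):
    """Run one extract-transform-load cycle; return the number of rows moved."""
    rows = extract_new_rows(source, last_loaded_id(warehouse))
    load(warehouse, transform(rows))
    return len(rows)


def demo():
    """Build two in-memory databases (assumed schema) and run three cycles."""
    source = sqlite3.connect(":memory:")
    warehouse = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, "
                   "product TEXT, amount_cents INTEGER)")
    warehouse.execute("CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, "
                      "product TEXT, amount REAL)")
    source.executemany("INSERT INTO sales VALUES (?, ?, ?)",
                       [(1, " Laptop ", 99900), (2, "Mouse", 1950)])
    first = run_incremental_etl(source, warehouse)
    source.execute("INSERT INTO sales VALUES (3, 'Keyboard', 4500)")
    second = run_incremental_etl(source, warehouse)
    third = run_incremental_etl(source, warehouse)  # nothing new to move
    loaded = warehouse.execute(
        "SELECT sale_id, product, amount FROM sales ORDER BY sale_id"
    ).fetchall()
    return first, second, third, loaded


if __name__ == "__main__":
    print(demo())
```

Tracking the highest key already loaded keeps each run limited to new records, which is what makes the pipeline safe to schedule daily. In an orchestrated setup, each of the extract, transform, and load functions would typically become its own task.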

Taught by

Rav Ahuja and Ramesh Sannareddy

Reviews

4.7 rating at Coursera based on 142 ratings
