This course covers the fundamentals of relational databases and how to map Python data structures to database tables using object-relational mappers (ORMs). You'll gain practical experience building ETL (Extract, Transform, Load) pipelines so that your data is accessible, reliable, and persistent. You'll also learn data validation and quality control, using tools like Pandas to explore, clean, and analyze your datasets. By the end of the course, you'll be equipped to uncover insights, identify biases, and apply best practices in data management.
Overview
Syllabus
- Data Science Fundamentals Part 1: Unit 3
- This module guides learners through essential data handling skills, from storing and persisting data using relational databases and object-relational mappers, to validating, exploring, and transforming data for analysis. Emphasizing practical techniques with tools like Pandas, the lessons cover best practices for querying, managing missing values, and using descriptive statistics and visualizations to understand data quality and distribution. The module provides a systematic approach to the ETL process, equipping students to efficiently prepare data for deeper analytical modeling.
Taught by
Pearson and Jonathan Dinu