Overview
Analyzing data with Python is a key skill for aspiring Data Scientists and Analysts!
This course takes you from the basics of importing and cleaning data to building and evaluating predictive models. You’ll learn how to collect data from various sources, wrangle and format it, perform exploratory data analysis (EDA), and create effective visualizations. As you progress, you’ll build linear, multiple, and polynomial regression models, construct data pipelines, and refine your models for better accuracy.
Through hands-on labs and projects, you’ll gain practical experience using popular Python libraries such as Pandas, NumPy, Matplotlib, Seaborn, SciPy, and Scikit-learn. These tools will help you manipulate data, create insights, and make predictions.
By completing this course, you’ll not only develop strong data analysis skills but also earn a Coursera certificate and an IBM digital badge to showcase your achievement.
Syllabus
- Importing Data Sets
  - This module introduces the foundational skills required to begin data analysis using Python. You will learn how to understand dataset structures, identify key variables, and import data from different sources using Python libraries such as Pandas and NumPy. The module also explores how to retrieve data from databases using SQLite and perform basic dataset exploration. Through hands-on labs, you will practice importing and examining real-world datasets such as laptop pricing and used car pricing.
- Data Wrangling
  - This module focuses on preparing data for analysis through essential data wrangling techniques. You will learn how to clean, transform, and format datasets by handling missing values, converting data types, normalizing numerical values, and creating bins for analysis. The module also introduces methods for transforming categorical variables into numerical representations suitable for modeling. Through hands-on exercises, you will apply these techniques to real-world datasets.
- Exploratory Data Analysis
  - This module develops your ability to analyze and understand datasets through exploratory data analysis techniques. You will learn how to calculate descriptive statistics, perform correlation analysis, and apply grouping techniques to uncover relationships between variables. The module also introduces data visualization methods and statistical tests such as the chi-square test for categorical variables. Through practical labs, you will analyze datasets to identify trends, patterns, and potential insights.
- Model Development
  - This module introduces the fundamentals of building predictive models using regression techniques. You will learn how to construct simple linear, multiple linear, and polynomial regression models to analyze relationships between variables. The module also covers methods for evaluating model performance using metrics such as R-squared and Mean Squared Error. Visualization techniques such as residual plots and KDE plots are used to assess how well models fit the data.
- Model Evaluation and Refinement
  - This module focuses on improving model performance through evaluation and optimization techniques. You will learn how to detect overfitting and underfitting and apply strategies to improve model generalization. The module introduces ridge regression and hyperparameter tuning using grid search to refine predictive models. Through hands-on labs, you will evaluate and improve regression models using real-world datasets.
- Final Assignment
  - In this module, you will apply the full data analysis workflow learned throughout the course. You will import, clean, analyze, and model real-world datasets to generate insights and predictions. The module includes a practice project and a final project that simulate real data analysis scenarios. You will also complete a final exam to demonstrate your understanding of key concepts in Python-based data analysis.
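
As a rough illustration of the workflow the syllabus describes (wrangling, exploratory analysis, regression, and refinement with ridge regression and grid search), here is a minimal Python sketch. It is not taken from the course materials: the synthetic data, the column names (`horsepower`, `fuel`, `price`), the bin labels, and the `alpha` grid are all assumptions made up for this example.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for a used-car pricing table (all values are made up).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "horsepower": rng.uniform(60, 250, n),
    "fuel": rng.choice(["gas", "diesel"], n),
})
df["price"] = 5000 + 120 * df["horsepower"] + rng.normal(0, 2000, n)
df.loc[::25, "horsepower"] = np.nan  # inject a few missing values

# Data wrangling: fill missing values, normalize, bin, and encode categories.
df["horsepower"] = df["horsepower"].fillna(df["horsepower"].mean())
df["hp_scaled"] = df["horsepower"] / df["horsepower"].max()
df["hp_bin"] = pd.cut(df["horsepower"], bins=3, labels=["low", "medium", "high"])
df = pd.get_dummies(df, columns=["fuel"])  # adds fuel_diesel and fuel_gas

# Exploratory analysis: a correlation and a grouped mean.
corr = df["horsepower"].corr(df["price"])
mean_price_by_bin = df.groupby("hp_bin", observed=True)["price"].mean()

# Model development: linear regression scored with R-squared and MSE.
X = df[["hp_scaled", "fuel_diesel"]].astype(float)
y = df["price"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
lr = LinearRegression().fit(X_train, y_train)
pred = lr.predict(X_test)
r2 = r2_score(y_test, pred)
mse = mean_squared_error(y_test, pred)

# Refinement: ridge regression with a small, arbitrary grid over alpha.
grid = GridSearchCV(Ridge(), {"alpha": [0.01, 0.1, 1.0, 10.0]}, cv=4)
grid.fit(X_train, y_train)
best_alpha = grid.best_params_["alpha"]

print(f"correlation={corr:.2f}  R^2={r2:.2f}  MSE={mse:.0f}  best alpha={best_alpha}")
```

Scoring on a held-out test split, rather than on the training data, is what makes it possible to spot the overfitting and underfitting mentioned in the Model Evaluation and Refinement module.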
Taught by
Joseph Santarcangelo
Reviews
4.0 rating, based on 2 Class Central reviews
4.7 rating at Coursera, based on 19,677 ratings
- It takes a few hours to complete. The course provides some basic lessons on working with data in Python. I think there are some better introductions available.
- The instructors demonstrated a clear and thorough understanding of the subject matter, making complex concepts accessible to learners of varying backgrounds. The course content was well-structured, starting with fundamental concepts before gradually…