IBM

Data Science with R - Capstone Project

IBM via Coursera

Overview

In this capstone course, you will apply the data science skills and techniques you learned in the previous courses of the IBM Data Science with R Specialization or the IBM Data Analytics with Excel and R Professional Certificate. For this project, you will assume the role of a Data Scientist who has recently joined an organization and is presented with a challenge that requires data collection, analysis, basic hypothesis testing, visualization, and modeling on real-world datasets. You will collect and understand data from multiple sources, wrangle and prepare data with Tidyverse, perform exploratory data analysis with SQL, Tidyverse, and ggplot2, model data with linear regression, create charts and plots to visualize the data, and build an interactive dashboard. The project culminates in a presentation of your data analysis report, with an executive summary for the organization's stakeholders.

Syllabus

  • Module 1 - Capstone Overview and Data Collection
    • In this module, you will be introduced to the capstone project scenario and the real-world problem you will solve throughout this course. You will begin applying the data acquisition techniques learned in earlier courses to collect project data from multiple sources. You will gather data using web scraping methods to extract information from HTML pages and use API requests to retrieve external data such as weather information. The collected datasets will be organized into structured formats, preparing them for further analysis in the subsequent stages of the project.
  • Module 2 - Data Wrangling
    • In this module, you will apply data wrangling techniques learned in previous courses to clean and prepare the collected datasets for analysis. Working with the data gathered in Module 1, you will transform raw data into a structured and analysis-ready format. You will clean text data, standardize variables, handle missing values, and perform data transformations such as encoding and normalization. By the end of this module, you will have prepared a reliable dataset that supports meaningful exploration and modeling in later stages of the project.
  • Module 3 - Performing Exploratory Data Analysis with SQL, Tidyverse & ggplot2
    • At this stage of the capstone project, you will apply the data collection and data wrangling skills developed in the previous modules, along with your prior experience in SQL querying and data visualization. This module focuses on performing Exploratory Data Analysis (EDA) to better understand the patterns, relationships, and trends within the prepared datasets. You will work with the datasets generated in earlier modules to explore key variables, identify meaningful insights, and prepare the data for predictive modeling. If you encountered challenges in earlier steps, prepared datasets are available to help you continue progressing through the project. In this module, you will complete a series of hands-on labs that guide you through the essential stages of exploratory analysis.
  • Module 4 - Predictive Analysis
    • In this module, you will apply regression modeling techniques to build predictive models for bike-sharing demand using the prepared dataset. Drawing on modeling concepts learned earlier, you will construct and refine multiple regression models to improve prediction accuracy. You will evaluate model performance using appropriate statistical metrics and interpret the contribution of different predictor variables. This stage represents the transition from data exploration to predictive analysis within your capstone workflow.
  • Module 5 - Building an R Shiny Dashboard App
    • In this module, you will apply your data visualization and application development skills to create an interactive dashboard that presents the results of your predictive analysis. Using R Shiny and visualization tools, you will design a dashboard that enables users to explore predicted bike-sharing demand across locations. This module focuses on transforming analytical results into interactive visual tools that support data-driven decision-making.
  • Module 6 - Present Your Data-Driven Insights
    • In this final module, you will consolidate the results of your capstone project into a professional presentation that communicates your workflow, analysis, insights, and predictive results. You will prepare a structured presentation that highlights the project problem, methodology, key findings, and conclusions. This module represents the culmination of your learning journey, where you demonstrate your ability to apply data science skills to solve a real-world problem and communicate your results effectively.
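
As a rough illustration of the normalization and regression steps described in Modules 2 and 4, here is a minimal sketch. It is written in Python with only the standard library for brevity; the course itself works in R with Tidyverse, and nothing here is taken from the course materials. The function names (min_max_normalize, fit_simple_linear_regression, rmse) and the temperature and rental figures are made up for the example.

```python
from math import sqrt


def min_max_normalize(values):
    """Rescale values to the [0, 1] range (min-max normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map everything to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def fit_simple_linear_regression(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def rmse(actual, predicted):
    """Root mean squared error between observed and predicted values."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared) / len(actual))


# Hypothetical values for illustration only, not course data.
temperature_c = [5.0, 10.0, 15.0, 20.0, 25.0]
rented_bikes = [120.0, 210.0, 330.0, 390.0, 500.0]

# Module 2 style step: normalize the predictor.
temp_scaled = min_max_normalize(temperature_c)

# Module 4 style step: fit a one-predictor regression and evaluate it.
intercept, slope = fit_simple_linear_regression(temp_scaled, rented_bikes)
predictions = [intercept + slope * t for t in temp_scaled]
error = rmse(rented_bikes, predictions)

print(f"intercept={intercept:.1f}, slope={slope:.1f}, RMSE={error:.2f}")
```

In R, the same fit would typically be done with lm() rather than by hand; the hand-written version is used here only to show what the regression step computes.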

Taught by

Yan Luo

Reviews

4.6 rating at Coursera based on 112 ratings
