Overview
Learn the foundations of data science by exploring, transforming, and visualizing data with R. In this course, you’ll develop core skills in exploratory data analysis and statistical thinking, including using visualizations to uncover patterns, identifying trends, and generating insights.
You’ll gain hands-on experience with tidyverse packages in R, work in RStudio, and create reproducible reports with Quarto. Along the way, you’ll also learn version control practices with Git and GitHub to document and share your work.
By the end of this course, you’ll be able to transform and summarize data, craft clear and informative graphics, and communicate your findings through professional, reproducible workflows, laying the groundwork for all your future data science projects.
Syllabus
- Hello World
- Hello World! In the first module, you will learn what data science is and how data science techniques are used to make meaning from data and inform data-driven decisions. The module also discusses the importance of reproducibility in science and the techniques used to achieve it. Next, you will be introduced to R, RStudio, Quarto, and GitHub, and to the role each plays in data science and reproducibility.
- Data and visualization
- In our second module, we'll advance our understanding of R to set the stage for creating data visualizations with ggplot2, the tidyverse’s data visualization package. We'll learn about different data types and the visualization techniques appropriate for plotting each of them. Most of this module focuses on understanding ggplot2 syntax and how it relates to the Grammar of Graphics. By the end of this module, you will have started building the foundation of the statistical toolkit needed to create basic data visualizations in R.
- Visualizing, transforming, and summarizing types of data
- In this module, we will take a step back and learn to use dplyr, the tidyverse’s data wrangling package, both to transform data that is not yet ready for visualization and to summarize data. In addition to describing distributions of single variables, you will also learn to explore relationships between two or more variables. Finally, you will continue to hone your data visualization skills with plots for various data types.
Taught by
Mine Çetinkaya-Rundel and Dr. Elijah Meyer