Statistical Modeling for Data Science Applications
University of Colorado Boulder via Coursera Specialization
Overview
Statistical modeling lies at the heart of data science. Well-crafted statistical models allow data scientists to draw conclusions about the world from the limited information present in their data. In this three-credit sequence, learners will add intermediate and advanced statistical modeling techniques to their data science toolkit. In particular, learners will become proficient in the theory and application of linear regression analysis, ANOVA and experimental design, and generalized linear and additive models. Emphasis will be placed on analyzing real data using the R programming language.
This specialization can be taken for academic credit as part of CU Boulder’s Master of Science in Data Science (MS-DS) degree offered on the Coursera platform. The MS-DS is an interdisciplinary degree that brings together faculty from CU Boulder’s departments of Applied Mathematics, Computer Science, Information Science, and others. With performance-based admissions and no application process, the MS-DS is ideal for individuals with a broad range of undergraduate education and/or professional experience in computer science, information science, mathematics, and statistics. Learn more about the MS-DS program at https://www.coursera.org/degrees/master-of-science-data-science-boulder.
Logo adapted from photo by Vincent Ledvina on Unsplash
Syllabus
- Course 1: Modern Regression Analysis in R
- Course 2: ANOVA and Experimental Design
- Course 3: Generalized Linear Models and Nonparametric Regression
Courses
- Course 2: ANOVA and Experimental Design
This second course in statistical modeling will introduce students to analysis of variance (ANOVA), analysis of covariance (ANCOVA), and experimental design. ANOVA and ANCOVA, presented as types of linear regression model, will provide the mathematical basis for designing experiments for data science applications. Emphasis will be placed on important design-related concepts, such as randomization, blocking, factorial design, and causality. Some attention will also be given to ethical issues raised in experimentation.
- Course 3: Generalized Linear Models and Nonparametric Regression
In the final course of the Statistical Modeling for Data Science Applications specialization, learners will study a broad set of more advanced statistical modeling tools. These tools will include generalized linear models (GLMs), which will provide an introduction to classification (through logistic regression); nonparametric modeling, including kernel estimators and smoothing splines; and semi-parametric generalized additive models (GAMs). Emphasis will be placed on a firm conceptual understanding of these tools. Attention will also be given to ethical issues raised by using complicated statistical models.
- Course 1: Modern Regression Analysis in R
This course will provide a set of foundational statistical modeling tools for data science. In particular, students will be introduced to the methods, theory, and applications of linear statistical models, covering parameter estimation, residual diagnostics, goodness of fit, and various strategies for variable selection and model comparison. Attention will also be given to the misuse of statistical models and the ethical implications of such misuse.
Taught by
Brian Zaharatos