Build on basic regression skills to develop more sophisticated prediction models. Learn to combine multiple factors, incorporate categorical variables, properly clean and prepare data, validate models, and engineer custom features to improve predictions.
Syllabus
- Unit 1: Predicting Insurance Costs with Multiple Regression: Age and BMI Analysis
  - Comparing Single vs Multi-Factor Prediction Models
  - Extracting Meaning from Model Coefficients
  - Residual Analysis Reveals Model Weaknesses
- Unit 2: Incorporating Categorical Variables into Insurance Cost Models
  - Comparing OneHot Encoding Drop Strategies
  - Flexible Preprocessing Pipeline Configurations
  - Pipeline Performance Comparison and Analysis
  - Pipeline Inspection and Debugging Workflow
- Unit 3: Cleaning PredictHealth's Customer Database
  - Inspecting and Filling Missing Data
  - Completing the Missing Value Cleanup
  - Building Your Outlier Detection Function
  - Scaling Features with MinMax Normalization
- Unit 4: PredictHealth's Model Validation Strategy: Data Splitting, Evaluation, and Visualization
  - Splitting Data for Model Validation
  - Fixing Your Broken Preprocessing Pipeline
  - Implementing Cross Validation for Robust Assessment
  - Completing Your Model Performance Visualizations
- Unit 5: Creating PredictHealth's Custom Predictors: Feature Engineering for Better Insurance Cost Models
  - Creating Your First Age Groups
  - Visualizing Raw vs Engineered Features
  - Building Your Complete Feature Pipeline
  - Building Your First Predictive Model
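
To give a feel for two techniques named in the syllabus (MinMax normalization in Unit 3 and age grouping in Unit 5), here is a minimal plain-Python sketch. The function names, the age-group boundaries, and the sample ages are assumptions made up for illustration; they are not taken from the course material, which may use different tools and thresholds.

```python
def min_max_scale(values):
    """Rescale a list of numbers linearly so they fall in [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column has no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def age_group(age):
    """Bucket an age into a coarse label (hypothetical boundaries)."""
    if age < 30:
        return "young"
    if age < 50:
        return "middle"
    return "senior"


ages = [20, 35, 50, 60]  # made-up sample values, not course data
scaled_ages = min_max_scale(ages)
groups = [age_group(a) for a in ages]
```

Scaling keeps features on a comparable range, while grouping turns a continuous variable into a categorical one that a model can treat as distinct segments.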