Identify impactful features, reduce dimensionality, and streamline datasets for analysis. Learn techniques to enhance model efficiency and performance by focusing on the most relevant data attributes.
Syllabus
- Unit 1: Data Preparation for Feature Selection
  - Fill Missing Values in Titanic
  - Drop Column With Excessive Missing Values
  - Encode Categorical Data Efficiently
  - Confirm Your Data Preparation Steps
- Unit 2: Feature Selection with Statistical Tests
  - Defining Features and Target Variable
  - Adjusting Feature Selection Parameters
  - Explore Mutual Information for Feature Selection
  - Selecting Top Features with Chi Square
  - Evaluate Feature Significance with Chi-Square
- Unit 3: Feature Ranking with Random Forests
  - Adjusting Random Forest Parameters
  - Debug Feature Ranking Code
  - Train and Rank Features
  - Feature Importance with Random Forests
- Unit 4: Dimensionality Reduction with PCA
  - Explained Variance with PCA Analysis
  - Exploring PCA Without Scaling
  - Enhance Your PCA Skills
  - Creating a DataFrame with PCA
- Unit 5: Automating Feature Engineering with Pipelines
  - Build a Pipeline in Python
  - Accessing PCA Explained Variance
  - Fix the Pipeline Missing Step
  - Enhance Pipelines with SelectKBest
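
As a rough illustration of where Unit 5 ends up, the sketch below chains scaling, SelectKBest, and PCA in a single scikit-learn pipeline. It is not taken from the course material: the synthetic data, the step names, the values of `k` and `n_components`, and the choice of `f_classif` as the scoring function are all assumptions made for the example. Chi-square scoring is avoided here only because it requires non-negative inputs, which standardized data does not provide.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (an assumption; the course itself works with Titanic).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
# Only the first two columns carry information about the target.
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Step names ("scale", "select", "pca") are hypothetical labels for this sketch.
pipeline = Pipeline([
    ("scale", StandardScaler()),                         # standardize each feature
    ("select", SelectKBest(score_func=f_classif, k=4)),  # keep the 4 highest-scoring features
    ("pca", PCA(n_components=2)),                        # project the kept features onto 2 components
])

# The target is passed through so SelectKBest can score features against it.
X_reduced = pipeline.fit_transform(X, y)

# Fitted steps remain accessible by name after training.
selected = pipeline.named_steps["select"].get_support(indices=True)
explained = pipeline.named_steps["pca"].explained_variance_ratio_

print("Selected feature indices:", selected)
print("Reduced shape:", X_reduced.shape)
print("Explained variance ratio:", explained)
```

Keeping every step inside one `Pipeline` object means the same scaling, selection, and projection are applied consistently whenever `transform` is called on new data, which is the main reason to automate these steps rather than run them by hand.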