Feature Selection for Machine Learning

via Train in Data

Overview

The most comprehensive online course on feature selection for machine learning. You will learn multiple feature selection methods to select the best features in your data set and build simpler, faster, and more reliable machine learning models.

If you're disappointed for whatever reason, you'll get a full refund.

Sole is a lead data scientist, instructor, and developer of open-source software. She created and maintains the Python library Feature-engine, which lets users impute data, encode categorical variables, and transform, create, and select features. Sole is also the author of the "Python Feature Engineering Cookbook", published by Packt.

Feature selection is the process of identifying and selecting a subset of features from the original data set to use as inputs in a machine learning algorithm.

Data sets often contain a large number of features. Feature selection algorithms let us quickly discard irrelevant features and identify the important ones.

Feature selection algorithms fall into three categories: filter methods, wrapper methods, and embedded methods.

Filter methods comprise basic data preprocessing steps that remove constant and duplicated features, plus statistical tests that assess feature importance. Wrapper methods wrap the search around the estimator: they use forward and backward selection to evaluate candidate feature subsets and identify the best one. Embedded methods combine feature selection with the fitting of the classification or regression model.
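
The three categories can be illustrated with a minimal sketch, assuming scikit-learn and NumPy are installed. The synthetic data set, the estimators, and the parameter values (for example `alpha=1.0` and `n_features_to_select=3`) are assumptions chosen for illustration and are not taken from the course material.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.feature_selection import (
    SelectFromModel,
    SequentialFeatureSelector,
    VarianceThreshold,
)
from sklearn.linear_model import Lasso, LinearRegression

# Toy regression data (an assumption): 8 features, 3 of them informative.
X, y = make_regression(
    n_samples=200, n_features=8, n_informative=3, noise=5.0, random_state=0
)
# Append a constant column so the filter step has something to remove.
X = np.hstack([X, np.ones((X.shape[0], 1))])

# Filter method: drop zero-variance (constant) features; no model is involved.
filter_step = VarianceThreshold(threshold=0.0)
X_filtered = filter_step.fit_transform(X)

# Wrapper method: forward selection evaluates feature subsets by
# cross-validating the estimator it wraps.
wrapper = SequentialFeatureSelector(
    LinearRegression(),
    n_features_to_select=3,
    direction="forward",
    cv=3,
)
X_wrapped = wrapper.fit_transform(X_filtered, y)

# Embedded method: the Lasso penalty shrinks some coefficients to zero
# while the model is fitted; features with non-zero coefficients are kept.
embedded = SelectFromModel(Lasso(alpha=1.0))
X_embedded = embedded.fit_transform(X_filtered, y)

print(X.shape, X_filtered.shape, X_wrapped.shape, X_embedded.shape)
```

The wrapper step is usually the slowest of the three because it refits the estimator for every candidate subset, whereas the filter step never trains a model at all.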

Feature selection is key to creating models that are faster and easier to interpret, and to reducing overfitting. When building machine learning models for real-world use, feature selection is an integral part of the machine learning pipeline.
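
As a rough illustration of selection as a pipeline step, the sketch below chains two filter steps and a classifier with scikit-learn's `Pipeline`, so the selectors are fitted on the training split only. The synthetic data, the choice of `k=5`, and the step names are assumptions made for this example, not the course's own pipeline.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, VarianceThreshold, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

# Toy classification data (an assumption): 20 features, 4 informative.
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=4, n_redundant=0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

pipe = Pipeline(
    steps=[
        # Basic filter: remove constant features.
        ("drop_constant", VarianceThreshold(threshold=0.0)),
        # Statistical filter: keep the 5 features with the highest ANOVA F-score.
        ("anova", SelectKBest(score_func=f_classif, k=5)),
        # Final estimator trained on the selected features only.
        ("model", LogisticRegression(max_iter=1000)),
    ]
)

# Fitting the pipeline fits each selector on the training data, then the model.
pipe.fit(X_train, y_train)

n_kept = int(pipe.named_steps["anova"].get_support().sum())
test_accuracy = pipe.score(X_test, y_test)
print(n_kept, round(test_accuracy, 3))
```

Keeping the selectors inside the pipeline means the same feature subset is applied automatically at prediction time, and the test split never influences which features are chosen.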

In this course, you will learn multiple feature selection techniques, gathered from scientific articles, data science competitions and my experience as a data scientist, to identify relevant features in your data sets.

You will learn filter, wrapper, embedded, and hybrid methods. The syllabus below lists the techniques covered in each group.

Syllabus

  •   Welcome
    • Introduction
    • Course curriculum overview
    • Course requirements
    • Course aim
    • How did you hear about us?
    • Refer a friend program
    • Resources for feature engineering
  •   Course material
    • Course material
    • Download Jupyter notebooks
    • Download datasets
    • Download presentations
  •   Feature selection
    • What is feature selection?
    • Feature selection methods | Overview
    • Filter methods
    • Wrapper methods
    • Embedded methods
    • Moving Forward
    • Open-source packages for feature selection
    • Quiz
    • Reading resources
  •   Filter Methods | Basic
    • Constant, quasi-constant, and duplicated features – Intro
    • Constant features
    • Quasi-constant features
    • Install Feature-engine
    • Drop constant and quasi-constant with Feature-engine
    • Duplicated features
    • Drop duplicates with Feature-engine
  •   Filter methods | Correlation
    • Correlation - Intro
    • Correlation Feature Selection
    • Correlation procedures to select features
    • Correlation | Notebook demo
    • Basic methods plus Correlation pipeline
    • Correlation with Feature-engine
    • Feature Selection Pipeline with Feature-engine
    • Categorical variables and correlation
    • Additional reading resources
    • Added Treat: A Movie We Recommend 🍿
  •   Filter methods | Statistical tests
    • Statistical tests for feature selection – intro
    • Statistical tests for feature selection - characteristics
    • Mutual information
    • MI for continuous variables
    • Select features with MI
    • Mutual information demo
    • Chi-square test
    • Chi-square | Demo
    • Chi-square considerations
    • Chi2 - calculating the expected frequencies (optional)
    • Chi-square quiz
    • Anova
    • Anova | Demo
    • Correlation with the target
    • Correlation with the target - demo
    • Select features based on p-values
    • Basic methods + Correlation + Filter with stats pipeline
    • Reading resources
  •   Filter Methods | Other methods and metrics
    • Filter Methods with other metrics
    • Univariate model performance metrics
    • Univariate model performance metrics | Demo
    • Univariate model performance with Feature-engine
    • KDD 2009: Select features by target mean encoding
    • KDD 2009: Select features by mean encoding | Demo
    • Target Mean Encoding Selection with Feature-engine
    • Reading resources
    • Extra Treat: Our Reading Suggestion 📕
  •   Wrapper methods
    • Wrapper methods – Intro
    • MLXtend
    • Step forward feature selection
    • SFS - MLXtend vs Sklearn
    • Step forward feature selection | MLXtend
    • Step forward feature selection | sklearn
    • Step backward feature selection
    • Step backward feature selection | MLXtend
    • Step backward feature selection | Sklearn
    • Exhaustive search
    • Exhaustive search | Demo
  •   Embedded methods | Linear models
    • Regression Coefficients – Intro
    • Selection by Logistic Regression Coefficients
    • Selection by Linear Regression Coefficients
    • Coefficients change with penalty
    • Basic methods + Correlation + Embedded method using coefficients
    • More Wisdom: Our Chosen Podcast Episode 🎧
  •   Embedded methods – Lasso regularisation
    • Regularisation – Intro
    • Lasso
    • A note on SelectFromModel
    • Basic filter methods + LASSO pipeline
  •   Embedded methods | Trees
    • Feature Selection by Tree importance | Intro
    • Feature Selection by Tree importance | Demo
    • Feature Selection by Tree importance | Recursively
    • Feature selection with decision trees | review
  •   Hybrid feature selection methods
    • Introduction to hybrid methods
    • Feature Shuffling - Intro
    • Shuffling features | Demo
    • Recursive feature elimination - Intro
    • Recursive feature elimination | Demo
    • Recursive feature addition - Intro
    • Recursive feature addition | Demo
    • Feature Shuffling with Feature-engine
    • Recursive feature elimination with Feature-engine
    • Recursive feature addition with Feature-engine
  •   Congratulations! You did it!
    • Congratulations
    • Additional reading resources
    • Next steps

Taught by

Soledad Galli
