Handling Unbalanced Datasets

via CodeSignal

Overview

In this course, you'll learn to recognize and address class imbalance in datasets. You'll explore practical undersampling and oversampling techniques, visualize their effects, and apply more advanced resampling strategies such as Tomek links, SMOTE, and ADASYN. By the end, you'll be able to train models that perform better on imbalanced data.

Syllabus

  • Unit 1: Identifying and Understanding Data Imbalance
    • Counting Classes to Spot Imbalance
    • Visualizing Imbalance with Bar Plots
    • Quantifying Imbalance with Class Percentages
  • Unit 2: Undersampling Techniques for Handling Unbalanced Datasets
    • Balancing Classes with Random Undersampling
    • Cleaning Boundaries with Tomek Links
    • Comparing Undersampling Techniques Side by Side
    • Applying Undersampling to Real Data
  • Unit 3: Oversampling Techniques for Handling Unbalanced Datasets
    • Counting the Balance in Random Oversampling
    • Custom SMOTE for Partial Rebalancing
    • Fine-Tuning ADASYN for Subtle Rebalancing
    • SMOTE with Real World Data
  • Unit 4: Training a Better Model with Resampling Techniques
    • Training a Baseline Logistic Regression Model
    • Building a Resampling Pipeline
    • Training Models with Resampled Data
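
To give a rough feel for the ideas named in Units 1 to 3, here is a minimal sketch of class percentages, random undersampling, and random oversampling. It is not taken from the course materials: the function names (class_percentages, random_undersample, random_oversample) and the toy data are hypothetical, and it uses only the Python standard library rather than any dedicated resampling package.

```python
import random
from collections import Counter


def group_by_class(rows, labels):
    """Collect rows into a dict keyed by class label."""
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    return by_class


def class_percentages(labels):
    """Return {class: percentage of all samples} to quantify imbalance."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: 100.0 * n / total for cls, n in counts.items()}


def random_undersample(rows, labels, seed=0):
    """Randomly drop samples so every class matches the smallest class."""
    rng = random.Random(seed)
    by_class = group_by_class(rows, labels)
    target = min(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        for row in rng.sample(group, target):  # sample without replacement
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def random_oversample(rows, labels, seed=0):
    """Randomly duplicate samples so every class matches the largest class."""
    rng = random.Random(seed)
    by_class = group_by_class(rows, labels)
    target = max(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        # Keep every original row, then add random duplicates up to the target.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for row in group + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Toy data, made up for illustration: 90 samples of class 0, 10 of class 1.
labels = [0] * 90 + [1] * 10
rows = [[i] for i in range(100)]

print(class_percentages(labels))
under_rows, under_labels = random_undersample(rows, labels)
over_rows, over_labels = random_oversample(rows, labels)
print(Counter(under_labels))
print(Counter(over_labels))
```

Undersampling discards majority-class rows and oversampling repeats minority-class rows, so both change the training set only. A common precaution is to resample after splitting off the test data, so evaluation still reflects the original class distribution.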
