Advanced Data Cleaning: Handling Text Data with Python

via CodeSignal

Overview

This course extends data cleaning techniques to handle text-based data in tabular datasets. It covers cleaning and processing text columns, dealing with mixed data types, extracting meaningful features from text, and preparing text data for machine learning.
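As a rough illustration of the kind of text-column cleaning described above, here is a minimal sketch using only the Python standard library. The function name `clean_text` and the set of characters it keeps are assumptions made for this example; they are not taken from the course materials.

```python
import re
import unicodedata


def clean_text(value: str) -> str:
    """Normalize Unicode, drop special characters, and collapse whitespace.

    Hypothetical helper for illustration; the kept-character set is an assumption.
    """
    # NFKC folds compatibility forms (full-width letters, non-breaking spaces)
    # into their ordinary equivalents.
    text = unicodedata.normalize("NFKC", value)
    # Keep word characters, whitespace, and a small set of punctuation marks.
    text = re.sub(r"[^\w\s.,%$()-]", "", text)
    # Collapse runs of whitespace into single spaces and trim the ends.
    text = re.sub(r"\s+", " ", text).strip()
    return text.lower()


if __name__ == "__main__":
    print(clean_text("  Ｈｅｌｌｏ\u00a0 World!! "))
```

Lowercasing at the end is a design choice that suits matching and deduplication; it would be dropped where case carries meaning.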

Syllabus

  • Unit 1: Handling Text Columns in Tabular Data with Python
    • Refine Your Text Cleaning Skills
    • Fix Numeric Values in Text Data
    • Standardize Synonyms for Clean Data
    • Track and Log Data Transformations
  • Unit 2: Removing Special Characters and Normalizing Text Using Python
    • Enhance Text Readability with Python
    • Fix the Text Formatting Issue
    • Enhance Text with Unicode Normalization
    • Logging Special Character Removal
  • Unit 3: Handling Mixed Data Types in Columns Using Python
    • Transform Percentages in Data Columns
    • Debugging Accounting Format in Data
    • Log and Handle Non-numeric Entries
    • Categorize Prices with Python Functions
  • Unit 4: Preparing Text Data for Machine Learning Using Python
    • Bigrams in Feature Extraction
    • Enhance Text Preprocessing Skills
    • Enhance Text with N-Gram TF-IDF
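
The Unit 3 topics (percentages, accounting format, non-numeric entries) can be sketched with a small standard-library parser. This is only an assumed approach for illustration: `parse_number` is a hypothetical helper, and the conventions it encodes (percentages become fractions, parentheses mean negative, unparseable entries become `None`) are choices made for this example rather than the course's own solution.

```python
import re
from typing import Optional

# Plain decimal number with an optional sign, e.g. "12", "-3.5", ".75".
_NUMBER = re.compile(r"[-+]?\d*\.?\d+")


def parse_number(raw: str) -> Optional[float]:
    """Convert a text cell to a float, or return None if it is not numeric."""
    text = raw.strip()
    # Accounting format: "(1,234.50)" stands for -1234.50.
    negative = text.startswith("(") and text.endswith(")")
    if negative:
        text = text[1:-1].strip()
    # Percentages: "45%" becomes the fraction 0.45.
    is_percent = text.endswith("%")
    if is_percent:
        text = text[:-1]
    # Drop currency symbols and thousands separators.
    text = text.replace("$", "").replace(",", "").strip()
    if not _NUMBER.fullmatch(text):
        return None  # non-numeric entry such as "N/A"; the caller can log it
    number = float(text)
    if is_percent:
        number /= 100
    return -number if negative else number


if __name__ == "__main__":
    for cell in ["45%", "($1,234.50)", "N/A", " 12 "]:
        print(repr(cell), "->", parse_number(cell))
```

Returning `None` rather than raising keeps the non-numeric entries visible, so they can be counted or logged before deciding how to handle them.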
