
Data Cleaning in Python Essential Training

via LinkedIn Learning

Overview

Improve your organization's analytics workflow by strengthening your data cleaning skills in Python.

Syllabus

Introduction
  • Why is clean data important?
  • Using Codespaces
1. Bad Data
  • Types of errors
  • Missing values
  • Bad values
  • Duplicates
2. Causes of Errors
  • Human errors
  • Machine errors
  • Design errors
  • Challenge: UI design
  • Solution: UI design
3. Detecting Errors
  • Schemas
  • Validation
  • Finding missing data
  • Domain knowledge
  • Subgroups
  • Using Copilot to build schema
  • Challenge: Find bad data
  • Solution: Find bad data
4. Preventing Errors
  • Serialization formats
  • Digital signature
  • Data pipelines and automation
  • Transactions
  • Data organization and tidy data
  • Process and data quality metrics
  • Challenge: ETL
  • Solution: ETL
5. Fixing Errors
  • Renaming fields
  • Fixing types
  • Joining and splitting data
  • Deleting bad data
  • Filling missing values
  • Reshaping data
  • Challenge: Workshop earnings
  • Solution: Workshop earnings
Conclusion
  • Next steps

Taught by

Miki Tebeka

Reviews

4.4 rating at LinkedIn Learning based on 417 ratings
