DataComPy - Dataframe Comparisons Made Explicit

PyCon US via YouTube

Overview

Explore DataComPy, a dataframe validation tool with over 1.1 million monthly installations, in this 34-minute PyCon US talk sponsored by Capital One. DataComPy makes differences between schemas and data explicit by generating a comparison report for a pair of dataframes. The report includes metrics such as match percentages, maximum differences, and sample mismatches for comparable columns. The talk covers the supported input types for native comparisons (Pandas, Polars, Spark, Snowpark) and indirect comparisons (Dask, Ray), as well as comparisons of tables in certain databases, such as Snowflake and DuckDB. It also explains how DataComPy works, its key features, and the practical use cases it addresses in data validation workflows.
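
To make the metrics named above concrete, here is a minimal plain-Python sketch of a keyed comparison between two small tables. It is not DataComPy's code or API. The function `compare_rows`, the report keys, and the sample data are all hypothetical names chosen for illustration, and the logic is a simplified stand-in for what such a report might compute (exact equality only, no tolerances).

```python
from typing import Any

Row = dict[str, Any]


def compare_rows(base: list[Row], other: list[Row], key: str) -> dict[str, Any]:
    """Hand-rolled comparison of two row lists joined on `key` (illustrative only)."""
    base_by_key = {row[key]: row for row in base}
    other_by_key = {row[key]: row for row in other}
    shared_keys = sorted(base_by_key.keys() & other_by_key.keys())

    # Schema differences: columns that appear on one side only.
    base_cols = set().union(*(row.keys() for row in base))
    other_cols = set().union(*(row.keys() for row in other))

    report: dict[str, Any] = {
        "columns_only_in_base": sorted(base_cols - other_cols),
        "columns_only_in_other": sorted(other_cols - base_cols),
        "rows_only_in_base": sorted(base_by_key.keys() - other_by_key.keys()),
        "rows_only_in_other": sorted(other_by_key.keys() - base_by_key.keys()),
        "columns": {},
    }

    # Data differences: compare each shared column over the shared rows.
    for col in sorted((base_cols & other_cols) - {key}):
        mismatches = []
        max_diff = 0.0
        for k in shared_keys:
            a, b = base_by_key[k].get(col), other_by_key[k].get(col)
            if a != b:
                mismatches.append((k, a, b))
                # A numeric difference is only defined for numeric values.
                if isinstance(a, (int, float)) and isinstance(b, (int, float)):
                    max_diff = max(max_diff, abs(a - b))
        matched = len(shared_keys) - len(mismatches)
        report["columns"][col] = {
            "match_pct": 100.0 * matched / len(shared_keys) if shared_keys else 100.0,
            "max_diff": max_diff,
            "sample_mismatches": mismatches[:3],  # keep only a small sample
        }
    return report


# Hypothetical sample data: two versions of an accounts table.
base = [
    {"acct_id": 1, "balance": 100.0, "name": "Ann"},
    {"acct_id": 2, "balance": 250.5, "name": "Bob"},
    {"acct_id": 3, "balance": 75.0, "name": "Cy"},
]
other = [
    {"acct_id": 1, "balance": 100.0, "name": "Ann", "region": "east"},
    {"acct_id": 2, "balance": 251.0, "name": "Bob", "region": "west"},
    {"acct_id": 4, "balance": 10.0, "name": "Dee", "region": "east"},
]

report = compare_rows(base, other, key="acct_id")
print(report["columns"]["balance"])
```

In this sketch, rows are matched on a join key before columns are compared, so rows and columns that exist on only one side are reported separately from value mismatches. A real tool would also need to handle duplicate keys, missing values, and numeric tolerances, which this example leaves out.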

Syllabus

DataComPy - Dataframe Comparisons Made Explicit (Sponsor: Capital One)

Taught by

PyCon US
