Explore the capabilities of DataComPy, a popular data validation library with over 1.1 million monthly installations, in this 34-minute PyCon US talk sponsored by Capital One. Learn how the tool makes differences in schema and data explicit by generating a comprehensive comparison report for a pair of dataframes. Discover the metrics it reports, including match percentages, maximum differences, and sample mismatches between comparable columns. The presentation covers DataComPy's support for several input types, both through native comparisons (Pandas, Polars, Spark, Snowpark) and indirect comparisons (Dask, Ray), as well as its ability to compare tables in certain databases, such as Snowflake and DuckDB. Gain insight into how DataComPy works, its key features, and the practical use cases it addresses, so you can strengthen your own data validation workflows.
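
To make the metrics mentioned above concrete, here is a minimal pure-Python sketch of what a match percentage, a maximum difference, and sample mismatches could look like for a single numeric column. It is an illustration only, not DataComPy's API or implementation: the helper `compare_column`, the `id` join key, and the sample rows are all hypothetical.

```python
def compare_column(left, right, key, column, sample_size=3):
    """Compare one numeric column across two lists of row dicts joined on `key`.

    Hypothetical helper for illustration; it is not part of DataComPy.
    """
    left_by_key = {row[key]: row for row in left}
    right_by_key = {row[key]: row for row in right}

    # Only rows whose key appears on both sides can be compared value by value.
    shared = sorted(left_by_key.keys() & right_by_key.keys())

    mismatches = []
    max_diff = 0.0
    for k in shared:
        a = left_by_key[k][column]
        b = right_by_key[k][column]
        if a != b:
            mismatches.append((k, a, b))
            max_diff = max(max_diff, abs(a - b))

    matched = len(shared) - len(mismatches)
    return {
        "rows_compared": len(shared),
        "match_pct": 100.0 * matched / len(shared) if shared else 0.0,
        "max_diff": max_diff,
        "sample_mismatches": mismatches[:sample_size],
        "only_in_left": sorted(left_by_key.keys() - right_by_key.keys()),
        "only_in_right": sorted(right_by_key.keys() - left_by_key.keys()),
    }


# Made-up sample data: two versions of the same small table.
original = [
    {"id": 1, "balance": 100.0},
    {"id": 2, "balance": 250.5},
    {"id": 3, "balance": 75.0},
    {"id": 4, "balance": 10.0},
]
new = [
    {"id": 1, "balance": 100.0},
    {"id": 2, "balance": 250.0},
    {"id": 3, "balance": 75.0},
    {"id": 5, "balance": 1.0},
]

result = compare_column(original, new, key="id", column="balance")
for name, value in result.items():
    print(f"{name}: {value}")
```

In this sketch, rows are paired on the join key first, so rows present on only one side are listed separately rather than counted as value mismatches.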