Cleaning Data in Java

via DataCamp

Overview

Master data cleaning in Java using statistical methods, transformations, and validation for reliable apps.

Clean data is crucial for solving business problems. When data quality suffers, analytics become unreliable, machine learning models make poor predictions, and business decisions go awry.

This course equips you with Java tools to tackle data quality head-on. You'll learn statistical methods to spot outliers and handle missing values, master data transformations from standardizing text to managing dates across time zones, and implement range checks, pattern validation with regular expressions, and constraint validation with annotations.
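
To give a rough idea of what spotting outliers with statistics can look like, here is a minimal sketch that flags values by z-score using only the Java standard library. It is not taken from the course materials: the class name `OutlierSketch`, the method `findOutliers`, the sample ages, and the threshold of 2.0 are all illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class; not part of the course materials.
public class OutlierSketch {

    // Returns the values whose distance from the mean, measured in
    // population standard deviations, exceeds the given threshold.
    public static List<Double> findOutliers(double[] values, double threshold) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        double mean = sum / values.length;

        double squaredDiffs = 0.0;
        for (double v : values) {
            squaredDiffs += (v - mean) * (v - mean);
        }
        double stdDev = Math.sqrt(squaredDiffs / values.length);

        List<Double> outliers = new ArrayList<>();
        if (stdDev == 0.0) {
            return outliers; // all values identical, nothing to flag
        }
        for (double v : values) {
            if (Math.abs(v - mean) / stdDev > threshold) {
                outliers.add(v);
            }
        }
        return outliers;
    }

    public static void main(String[] args) {
        // Made-up ages with one implausible entry.
        double[] ages = {31, 29, 35, 33, 30, 32, 28, 34, 31, 240};
        System.out.println(findOutliers(ages, 2.0)); // prints [240.0]
    }
}
```

A z-score cutoff is only one possible choice. With small samples a single extreme value inflates the standard deviation, so an interquartile-range rule may be more robust in practice.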

Working with Tablesaw, you'll clean real-world tabular data and perform transformations that prepare data for analysis. You'll finish ready to ensure data quality at every step of your applications.

Syllabus

  • Assessing Data Quality
    • Learn essential techniques for assessing data quality in Java applications. Discover how to use descriptive statistics to identify outliers, detect and handle missing values appropriately, and validate data types to prevent errors. Master key tools like DescriptiveStatistics for numerical analysis, Optional for null handling, and DateTimeFormatter for date validation.
  • Transforming Data
    • Master data transformation techniques for reliable Java applications. Learn to normalize strings using regular expressions for consistent text matching, standardize categories with EnumMap and HashMap for robust lookup tables, and handle date formats using Java's time API with LocalDate and ZoneId for consistent date handling across time zones.
  • Validating Data
    • Ensure data quality through validation techniques. Learn to implement range validation for numeric values and dates, master pattern validation using regular expressions to verify data formats, and apply constraint validation to enforce business rules.
  • Cleaning Tabular Data
    • Transform messy tabular data into clean, usable datasets with Tablesaw, a powerful Java library. You'll assess data quality, standardize column contents, and apply filtering operations to prepare your data. By the end, you'll confidently turn raw datasets into analysis-ready tables.
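
As an illustration of a few techniques named in the syllabus above, the following sketch uses only the Java standard library (Java 11 or later) to normalize text with a regular expression, represent an unparseable date as an empty `Optional`, and check a numeric range. The class `CleaningSketch` and its method names are hypothetical, and the ISO date format and age bounds are assumptions chosen for the example, not the course's own code.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Locale;
import java.util.Optional;

// Hypothetical example class; not part of the course materials.
public class CleaningSketch {

    // Assumed input format: ISO dates such as 2024-02-29.
    private static final DateTimeFormatter ISO_DATE = DateTimeFormatter.ISO_LOCAL_DATE;

    // Trims, collapses runs of whitespace, and lower-cases the text
    // so that equivalent labels compare equal.
    public static String normalize(String raw) {
        return raw.trim().replaceAll("\\s+", " ").toLowerCase(Locale.ROOT);
    }

    // Returns the parsed date, or Optional.empty() when the input is
    // missing, blank, or not a valid calendar date.
    public static Optional<LocalDate> parseDate(String raw) {
        if (raw == null || raw.isBlank()) {
            return Optional.empty();
        }
        try {
            return Optional.of(LocalDate.parse(raw.trim(), ISO_DATE));
        } catch (DateTimeParseException e) {
            return Optional.empty();
        }
    }

    // Inclusive range check for numeric values.
    public static boolean inRange(int value, int min, int max) {
        return value >= min && value <= max;
    }

    public static void main(String[] args) {
        System.out.println(normalize("  New   YORK "));          // new york
        System.out.println(parseDate("2024-02-30").isPresent()); // false
        System.out.println(inRange(240, 0, 120));                // false
    }
}
```

Returning `Optional` instead of `null` forces callers to decide explicitly how to treat a missing or invalid date, for example by dropping the row or substituting a default.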

Taught by

Dennis Lee

