Explore Raw Data

Google via Coursera

Details

Provider: Coursera
Pricing: Paid Course
Languages: English
Certificate: Certificate Available
Effort: 7 hours 1 minute
Sessions: Self-Paced
Level: Intermediate
Subtitles: English

Found in

Part of

Google Data Analysis with Python

Overview

Finding stories in data using exploratory data analysis (EDA) is all about organizing and interpreting raw data. Python can help you do this quickly and effectively. In this course, you’ll learn how to use Python to perform the EDA practices of discovering and structuring.

By the end of this course, you will be able to:
• Identify ethical issues that may come up during the data “discovering” practice of EDA
• Use Python to merge or join data based on defined criteria
• Use Python to sort and/or filter data
• Use relevant Python libraries for cleaning raw data
• Recognize opportunities for creating hypotheses based on raw data
• Recognize when and how to communicate status updates and questions to key stakeholders
• Apply Python tools to examine raw data structure and format
• Use the PACE workflow to understand whether given data is adequate and applicable to a data science project
• Differentiate between the common formats of raw data sources (JSON, tabular, etc.) and data types

Syllabus

"Discovering" is the beginning of an investigation

Data professionals must understand data sources, file formats, and responsible parties during exploratory analysis. In this module, you will learn when to contact data owners with questions or issues, how to import data using Python, and how to perform EDA using basic Python functions.

Understand data format

EDA discovery uses targeted questioning to identify data gaps and missing information. In this module, you will learn how to formulate hypotheses, manipulate datetime strings and create bar graph visualizations.
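As a rough illustration of the datetime-string manipulation this module mentions, here is a minimal sketch using only the Python standard library. The sample timestamps, the format string, and the helper name `count_by_month` are assumptions made for illustration; they are not taken from the course materials.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample timestamps, invented for this sketch.
raw_dates = [
    "2021-03-14 08:15:00",
    "2021-03-29 17:40:00",
    "2021-04-02 11:05:00",
]


def count_by_month(date_strings, fmt="%Y-%m-%d %H:%M:%S"):
    """Parse datetime strings and count how many fall in each year-month."""
    parsed = [datetime.strptime(s, fmt) for s in date_strings]
    # Re-format each parsed datetime as "YYYY-MM" so records group by month.
    return Counter(d.strftime("%Y-%m") for d in parsed)


monthly_counts = count_by_month(raw_dates)
```

Counts like these are the kind of per-category totals a bar graph displays; with matplotlib installed, one option would be to pass the keys and values of `monthly_counts` to `matplotlib.pyplot.bar`.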

Create structure from raw data

Structuring is an EDA practice for organizing data to learn more about it. In this module, you will learn different types of structuring methods, pandas tools for structuring datasets, and how to interpret histograms to understand data distributions.
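To give a sense of what structuring with pandas can look like, the sketch below sorts, filters, merges, and groups two small tables. The `orders` and `customers` tables and all of their column names are hypothetical, and these four operations are only one plausible reading of "structuring methods", not the course's own list.

```python
import pandas as pd

# Hypothetical tables, invented for this sketch.
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "customer_id": [10, 11, 10, 12],
    "amount": [25.0, 40.0, 15.0, 60.0],
})
customers = pd.DataFrame({
    "customer_id": [10, 11, 12],
    "region": ["East", "West", "East"],
})

# Sorting: largest orders first.
sorted_orders = orders.sort_values("amount", ascending=False)

# Filtering: keep only orders of at least 25.0.
large_orders = orders[orders["amount"] >= 25.0]

# Merging: attach each customer's region to their orders.
merged = orders.merge(customers, on="customer_id", how="left")

# Grouping: total order amount per region.
region_totals = merged.groupby("region")["amount"].sum()
```

With matplotlib installed, `merged["amount"].plot.hist()` would draw a histogram of order amounts, which is one way to inspect a distribution.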