Finding stories in data using exploratory data analysis (EDA) is all about organizing and interpreting raw data. Python can help you do this quickly and effectively. In this course, you’ll learn how to use Python to perform the EDA practices of discovering and structuring.
By the end of this course, you will be able to:
• Identify ethical issues that may come up during the data “discovering” practice of EDA
• Use Python to merge or join data based on defined criteria
• Use Python to sort and/or filter data
• Use relevant Python libraries for cleaning raw data
• Recognize opportunities for creating hypotheses based on raw data
• Recognize when and how to communicate status updates and questions to key stakeholders
• Apply Python tools to examine raw data structure and format
• Use the PACE workflow to understand whether given data is adequate and applicable to a data science project
• Differentiate between the common formats of raw data sources (JSON, tabular, etc.) and data types
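To give a sense of what the Python objectives above look like in practice, here is a minimal sketch using pandas. The tables, column names, and values are hypothetical and invented for illustration only; they are not taken from the course materials.

```python
import pandas as pd

# Hypothetical toy data, invented for illustration only.
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "customer_id": [10, 20, 10, 30],
    "amount": [25.0, 40.0, 15.0, 60.0],
})
customers = pd.DataFrame({
    "customer_id": [10, 20, 30],
    "region": ["East", "West", "East"],
})

# Examine structure and format: dimensions and column data types.
print(orders.shape)
print(orders.dtypes)

# Merge (join) the two tables on a shared key column.
merged = orders.merge(customers, on="customer_id", how="left")

# Filter rows with a boolean mask, then sort by a column.
east = merged[merged["region"] == "East"]
east_sorted = east.sort_values("amount", ascending=False)
print(east_sorted)
```

A left join keeps every row of `orders` even when no matching customer exists, which is often a safe default while the data is still being explored.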
Syllabus
- "Discovering" is the beginning of an investigation
- Data professionals must understand data sources, file formats, and responsible parties during exploratory analysis. In this module, you will learn when to contact data owners with questions or issues, how to import data with Python, and how to perform EDA using basic Python functions.
- Understand data format
- EDA discovery uses targeted questioning to identify data gaps and missing information. In this module, you will learn how to formulate hypotheses, manipulate datetime strings and create bar graph visualizations.
- Create structure from raw data
- Structuring is an EDA practice for organizing data to learn more about it. In this module, you will learn different structuring methods, use pandas tools for structuring datasets, and interpret histograms to understand data distributions.
- Review: Explore raw data
- Review everything you’ve learned and take the final assessment.
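As a rough illustration of the datetime and bar graph topics named in the syllabus above, the sketch below parses date strings with Python's standard library and counts records per month. The dates are hypothetical, and the text bars are a stand-in for a chart; a plotting library would normally draw the same counts as a bar graph.

```python
from collections import Counter
from datetime import datetime

# Hypothetical date strings, invented for illustration only.
raw_dates = [
    "2023-01-15", "2023-01-28", "2023-02-03",
    "2023-03-19", "2023-03-22", "2023-03-30",
]

# Parse each string into a datetime object.
parsed = [datetime.strptime(s, "%Y-%m-%d") for s in raw_dates]

# Reformat each date as a year-month label and count the occurrences.
month_labels = [d.strftime("%Y-%m") for d in parsed]
counts = Counter(month_labels)

# Print one text bar per month, in chronological order.
for month in sorted(counts):
    print(f"{month} {'#' * counts[month]}")
```

The per-month counts are the values a bar graph would plot, with one bar per month label.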
Taught by
Google Career Certificates