Overview
In our increasingly interconnected world, we’re collecting more raw data than ever. In “More Applied Data Science with Python,” you’ll learn how to extract and analyze complex datasets using Python. Practice using real-world datasets, like health data and comment sections, to develop visual representations and identify key patterns among populations. You’ll also learn to manage missing and messy data using advanced manipulation methods. Throughout this course series, you’ll build a foundation for advanced analytics and machine learning with the help of Scikit-Learn and NLP libraries by applying methods for data mining, clustering, topic modeling, network modeling, and information extraction. Upon completing the series, you’ll have advanced data analysis skills that help you draw insights from the datasets you explore.
Learners should have intermediate Python programming skills before enrolling in the Specialization. You are encouraged to complete Applied Data Science with Python before beginning this Specialization.
Syllabus
- Course 1: Data Mining in Python
- Course 2: Applied Unsupervised Learning in Python
- Course 3: Network Modeling and Analysis in Python
- Course 4: Applied Information Extraction in Python
Courses
- In “Applied Information Extraction in Python,” you will learn how to extract useful information from free-text data, which is a type of string data created when people type. Information of interest in free-text data includes names of people or organizations, location information such as cities and zip codes, and other elements like stock prices or clinical diagnoses. Free-text data is found everywhere, from magazine articles to social media posts, and can be complex to analyze. In this course, you’ll use applied machine learning and text-mining techniques to analyze free-text data. You will learn how to identify named entities and tag them with appropriate types of classifications, using real-world data from business, politics, and healthcare. You’ll develop multiple approaches to recognize and extract named entities and attributes of interest from free-text data, ranging from regular expressions to neural network models. Finally, you’ll explore Transformer-based large language models, such as those behind ChatGPT, to extract information from large datasets. This is the final course in “More Applied Data Science with Python,” a four-course series focused on helping you apply advanced data science techniques using Python. It is recommended that all learners complete the following courses from the Applied Data Science with Python Specialization: Introduction to Data Science in Python, Applied Machine Learning in Python, and Applied Text Mining in Python.
- In “Applied Unsupervised Learning in Python,” you will learn how to use algorithms to find interesting structure in datasets. You will practice applying, interpreting, and refining unsupervised machine learning models to solve a diverse set of problems on real-world datasets. This course will show you how to explore unlabelled data using several techniques: dimensionality reduction and manifold learning for condensing and visualizing high-dimensional data, clustering to reveal interesting groups and outliers, topic modeling for summarizing important themes in text, methods for dealing with missing data, and more. This course also covers best practices associated with different techniques and demonstrates how unsupervised learning can be used to improve supervised prediction. This is the second course in “More Applied Data Science with Python,” a four-course series focused on helping you apply advanced data science techniques using Python. It is recommended that all learners complete the Applied Data Science with Python Specialization prior to beginning this course.
- In “Data Mining in Python,” you will learn how to extract useful knowledge from large-scale datasets. This course introduces basic concepts and general tasks for data mining. You will explore a wide range of real-world datasets, including grocery store data, restaurant reviews, business operations, social media posts, and more. You will learn how to formally describe real-world information with general data representations (e.g., itemsets, vectors, matrices, and sequences). You will then learn how to formulate data in the wild with one or more of these representations. This course will teach you how to characterize and explain your data by looking for patterns and similarities, which are basic building blocks for advanced analysis and machine learning models. This is the first course in “More Applied Data Science with Python,” a four-course series focused on helping you apply advanced data science techniques using Python. It is recommended that all learners complete the Applied Data Science with Python Specialization prior to beginning this course.
- In “Network Modeling and Analysis in Python,” you will learn how different types of network analysis can be used to make sense of complex systems. You’ll learn how algorithms can be used to better understand disease epidemics, human community structure, and the flow of information on social media. This course combines network theory with empirical analysis of real-world networks using the Python library NetworkX. You’ll learn about community structure in networks as well as several popular algorithms for community detection and their applications. This course introduces a wide range of advanced network models. You’ll study random network generation models and how they can be used to create realistic graphs and explain how networks function. You’ll also learn about models that explain diffusion and the spread of epidemics in networks, such as the SI, SIS, SIR, independent cascade, and linear threshold models. This is the third course in “More Applied Data Science with Python,” a four-course series focused on helping you apply advanced data science techniques using Python. It is recommended that all learners complete the Applied Data Science with Python Specialization prior to beginning this course.
Taught by
Daniel Romero, Kevyn Collins-Thompson, Qiaozhu Mei and VG Vinod Vydiswaran