Overview
In the AI for Scientific Research specialization, we'll learn how to use AI in scientific situations to discover trends and patterns within datasets. Course 1 teaches a little bit about the Python language as it relates to data science. We'll share some existing libraries to help analyze your datasets. By the end of the course, you'll apply a classification model to predict the presence or absence of heart disease from a patient's health data. Course 2 covers the complete machine learning pipeline, from reading in, cleaning, and transforming data to running basic and advanced machine learning algorithms. In the final project, we'll apply our skills to compare different machine learning models in Python. In Course 3, we will build on our knowledge of basic models and explore two more advanced AI techniques: neural networks and random forests. We'll describe how the two techniques differ. Then, we'll complete a project predicting similarity between patients using random forests. In Course 4, a capstone project course, we'll compare genome sequences of COVID-19 mutations to identify potential areas a drug therapy can look to target. By the end, you'll be well on your way to discovering ways to combat disease with genome sequencing.
Syllabus
- Course 1: Introduction to Data Science and scikit-learn in Python
- Course 2: Machine Learning Models in Science
- Course 3: Neural Networks and Random Forests
- Course 4: Capstone Project: Advanced AI for Drug Discovery
Courses
- Machine Learning Models in Science
This course is aimed at anyone interested in applying machine learning techniques to scientific problems. In this course, we'll learn about the complete machine learning pipeline, from reading in, cleaning, and transforming data to running basic and advanced machine learning algorithms. We'll start with data preprocessing techniques, such as PCA and LDA. Then, we'll dive into the fundamental AI algorithms: SVMs and K-means clustering. Along the way, we'll build our mathematical and programming toolbox to prepare ourselves to work with more complicated models. Finally, we'll explore advanced methods such as random forests and neural networks. Throughout the course, we'll use medical and astronomical datasets. In the final project, we'll apply our skills to compare different machine learning models in Python.
- Neural Networks and Random Forests
In this course, we will build on our knowledge of basic models and explore advanced AI techniques. We'll start with a deep dive into neural networks, building our knowledge from the ground up by examining their structure and properties. Then we'll code some simple neural network models and learn how to avoid overfitting with regularization and other hyperparameter tricks. After a project predicting the likelihood of heart disease given health characteristics, we'll move to random forests. We'll describe the differences between the two techniques and explore their differing origins in detail. Finally, we'll complete a project predicting similarity between patients using random forests.
- Capstone Project: Advanced AI for Drug Discovery
In this capstone project course, we'll compare genome sequences of COVID-19 mutations to identify potential areas a drug therapy can look to target. The first step in drug discovery involves identifying subsequences of the virus's genome to target. We'll start by comparing the genomes of virus mutations to look for similarities. Then, we'll perform PCA to cut down our number of dimensions and identify the most common features. Next, we'll use K-means clustering in Python to find the optimal number of groups and trace the lineage of the virus. Finally, we'll predict similarity between the sequences and use this to pick a target subsequence. Throughout the course, each section will consist of a programming assignment coupled with a guide video and helpful hints. By the end, you'll be well on your way to discovering ways to combat disease with genome sequencing.
- Introduction to Data Science and scikit-learn in Python
This course will teach you how to leverage the power of Python and artificial intelligence to create and test hypotheses. We'll start from the ground up, learning some basic Python for data science before diving into some of its richer applications to test the hypotheses we create. We'll learn some of the most important libraries for exploratory data analysis (EDA) and machine learning, such as NumPy, pandas, and scikit-learn. After learning some of the theory (and math) behind linear regression, we'll go through a full pipeline of reading data, cleaning it, and applying a regression model to estimate the progression of diabetes. By the end of the course, you'll apply a classification model to predict the presence or absence of heart disease from a patient's health data.
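To give a feel for the read, clean, and fit pipeline described for the introductory course, here is a minimal sketch. It is not taken from the course materials: the course names pandas and scikit-learn for this kind of work, while this sketch uses only the Python standard library so it runs without any installs. The column names (`bmi`, `progression`), the toy values, and the helper functions `read_rows`, `clean`, and `fit_line` are all hypothetical and invented for illustration.

```python
import csv
import io

# Hypothetical toy data, invented for illustration only (not real patient data).
RAW_CSV = """bmi,progression
20.0,100
22.0,110
,95
24.0,120
26.0,130
28.0,n/a
30.0,150
"""


def read_rows(text):
    """Read CSV text into a list of dicts, one per data row."""
    return list(csv.DictReader(io.StringIO(text)))


def clean(rows):
    """Keep only rows where both fields parse as numbers."""
    cleaned = []
    for row in rows:
        try:
            cleaned.append((float(row["bmi"]), float(row["progression"])))
        except ValueError:
            continue  # skip rows with missing or non-numeric values
    return cleaned


def fit_line(points):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Read, clean, fit, then predict for a new (hypothetical) input value.
points = clean(read_rows(RAW_CSV))
slope, intercept = fit_line(points)
prediction = slope * 25.0 + intercept
print(f"kept {len(points)} rows, slope={slope:.2f}, prediction={prediction:.1f}")
```

With real data, the cleaning step usually needs more care than dropping bad rows, and a library model would typically replace the hand-written `fit_line`.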
Taught by
Neelesh Tiruviluamala, Rajvir Dua and Sabrina Moore