Overview
Learn how to prepare and clean bioactivity data for aromatase inhibitors in Python using the RDKit library in this 21-minute tutorial. Load data from the ChEMBL database, remove duplicate molecules, and standardize SMILES strings (text representations of molecular structure) to create a non-redundant dataset suitable for building machine learning models and for further data analysis. Follow along with the provided GitHub code repository to implement data preprocessing techniques used in bioinformatics research and computational drug discovery projects.
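As a rough illustration of the deduplication and standardization steps described above, here is a minimal sketch. It is not the tutorial's own code: the record layout (dicts with "id" and "smiles" keys) and the helper names `rdkit_canonical` and `deduplicate` are assumptions made for this example. Only `Chem.MolFromSmiles` and `Chem.MolToSmiles` are actual RDKit calls.

```python
from typing import Callable, Dict, Iterable, List, Optional


def rdkit_canonical(smiles: str) -> Optional[str]:
    """Return RDKit's canonical SMILES, or None if the string cannot be parsed."""
    # Imported lazily so deduplicate() below can be used without RDKit installed.
    from rdkit import Chem

    mol = Chem.MolFromSmiles(smiles)  # returns None for unparsable input
    return Chem.MolToSmiles(mol) if mol is not None else None


def deduplicate(
    records: Iterable[Dict[str, str]],
    canonicalize: Callable[[str], Optional[str]],
) -> List[Dict[str, str]]:
    """Keep the first record for each distinct canonical SMILES.

    Records whose SMILES cannot be canonicalized are dropped.
    The "smiles" key is a hypothetical field name, not a ChEMBL column name.
    """
    seen = set()
    unique = []
    for record in records:
        canonical = canonicalize(record["smiles"])
        if canonical is None or canonical in seen:
            continue
        seen.add(canonical)
        # Store the standardized string in place of the original one.
        unique.append({**record, "smiles": canonical})
    return unique
```

With RDKit installed, calling `deduplicate(records, rdkit_canonical)` should collapse entries that describe the same molecule with different SMILES spellings (for example "OCC" and "CCO" for ethanol) into a single record. Which duplicate to keep, for instance the first one or an average of the measured activities, is a design choice this sketch does not settle.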
Syllabus
Bioinformatics Project from Scratch PART 2 - Preparing the Data Set
Taught by
Data Professor