Overview
Learn how to generate high-quality synthetic data that preserves privacy using differential privacy techniques in this hands-on webinar. Walk through training differentially private generative models with MOSTLY AI's open-source Synthetic Data SDK and explore how this method compares to traditional anonymization approaches in terms of both utility and risk. Gain practical insights into configuring privacy parameters, understanding the impact of privacy budgets, and evaluating synthetic data output through live demonstrations and real-world examples.
Master the core concepts of differential privacy and understand how it differs from traditional anonymization techniques such as pseudonymization or masking. Install and configure the Synthetic Data SDK, then run it in LOCAL mode on prepared datasets to generate synthetic data with differential privacy enabled while exploring the available privacy settings. Compare synthetic datasets generated under different privacy settings to see how stricter privacy budgets affect utility, and evaluate the usefulness of differentially private synthetic data using predictive models and summary statistics.
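To make the idea of a privacy budget concrete, here is a minimal sketch of the classic Laplace mechanism applied to a mean. It is not how the Synthetic Data SDK trains its generative models and it does not use the SDK at all; the function names, the clipping bounds, and the toy `ages` data are all assumptions chosen for illustration. It only shows the general pattern that a smaller epsilon (a stricter budget) means more noise and therefore lower utility.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def private_mean(values, epsilon, lower, upper, rng):
    """Release an epsilon-DP mean of values clipped to [lower, upper].

    Assumes the dataset size is public and that neighboring datasets
    differ by replacing one record, so the sensitivity of the mean is
    (upper - lower) / n.
    """
    clipped = [min(max(v, lower), upper) for v in values]
    sensitivity = (upper - lower) / len(clipped)
    true_mean = sum(clipped) / len(clipped)
    return true_mean + laplace_noise(sensitivity / epsilon, rng)


def mean_abs_error(values, epsilon, lower, upper, trials=2000, seed=0):
    """Average absolute error of the private mean over repeated releases."""
    rng = random.Random(seed)
    clipped = [min(max(v, lower), upper) for v in values]
    true_mean = sum(clipped) / len(clipped)
    errors = [
        abs(private_mean(values, epsilon, lower, upper, rng) - true_mean)
        for _ in range(trials)
    ]
    return sum(errors) / trials


# Hypothetical toy data: 100 ages between 18 and 90.
data_rng = random.Random(1)
ages = [data_rng.randint(18, 90) for _ in range(100)]

error_strict = mean_abs_error(ages, epsilon=0.1, lower=0, upper=100)
error_loose = mean_abs_error(ages, epsilon=10.0, lower=0, upper=100)
print(f"epsilon=0.1  -> mean abs error {error_strict:.2f}")
print(f"epsilon=10.0 -> mean abs error {error_loose:.2f}")
```

With these settings the noise scale is 1/epsilon, so the stricter budget produces an error roughly a hundred times larger than the looser one. The same qualitative trade-off is what the session examines for synthetic data generated under different privacy settings.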
Discover how to create hybrid datasets that combine real and synthetic data, using synthetic data to augment or replace the sensitive parts of a dataset so that utility is retained while privacy improves. Assess the fidelity of synthetic datasets using predictive and discriminative machine learning models. Develop a strong understanding of privacy-utility trade-offs so you can confidently apply privacy-safe synthetic data in your own data science workflows.
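The discriminative evaluation mentioned above can be sketched in a few lines: train a classifier to tell real records from synthetic ones and look at its hold-out accuracy. Accuracy near 0.5 suggests the synthetic data is hard to distinguish from the real data, while accuracy near 1.0 suggests low fidelity. The sketch below is an assumption-laden toy, not the evaluation used in the webinar: it uses a single numeric column, a deliberately simple nearest-centroid classifier written with the standard library, and hypothetical Gaussian data in place of real and synthetic datasets.

```python
import random
from statistics import mean


def discriminator_accuracy(real, synthetic, seed=0):
    """Hold-out accuracy of a nearest-centroid real-vs-synthetic classifier."""
    rng = random.Random(seed)
    # Label real records 0 and synthetic records 1, then shuffle.
    labeled = [(x, 0) for x in real] + [(x, 1) for x in synthetic]
    rng.shuffle(labeled)
    split = len(labeled) // 2
    train, test = labeled[:split], labeled[split:]

    # "Training" is just computing one centroid per class.
    centroid_real = mean(x for x, y in train if y == 0)
    centroid_syn = mean(x for x, y in train if y == 1)

    correct = 0
    for x, y in test:
        predicted = 0 if abs(x - centroid_real) <= abs(x - centroid_syn) else 1
        correct += predicted == y
    return correct / len(test)


# Hypothetical stand-ins for a real column and two synthetic versions of it.
rng = random.Random(42)
real = [rng.gauss(50, 10) for _ in range(1000)]
faithful = [rng.gauss(50, 10) for _ in range(1000)]   # same distribution
distorted = [rng.gauss(80, 10) for _ in range(1000)]  # shifted distribution

acc_faithful = discriminator_accuracy(real, faithful)
acc_distorted = discriminator_accuracy(real, distorted)
print(f"faithful synthetic:  accuracy {acc_faithful:.2f}")
print(f"distorted synthetic: accuracy {acc_distorted:.2f}")
```

In practice a stronger classifier over all columns would be used, but the reading of the score is the same: the closer the discriminator is to guessing, the higher the fidelity of the synthetic data.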
This intermediate-level session is designed for data engineers, data scientists, ML engineers, and analysts with basic Python skills and familiarity with Jupyter Notebooks. It assumes a general understanding of machine learning workflows and experience working with tabular datasets. No prior experience with synthetic data is necessary.
Syllabus
Differentially-Private Synthetic Data for Everyone with Dr. Michael Platzer
Taught by
Open Data Science