How to Build a Data Pipeline Using Synthetic Data Generation and Testing with Python
PyCon South Africa via YouTube
Overview
This 31-minute conference talk from PyCon South Africa covers how to develop data pipelines when real data is unavailable. It presents practical techniques for generating and using synthetic data with Python, including statistical methods and packages such as Faker and SDV, to create realistic test data for customer profiles, transactions, and time series. It also shows how to use Flyway to load synthetic data into Postgres databases and manage repeatable deployments. Code examples and live demonstrations illustrate the best practices, benefits, and potential challenges of synthetic data testing. The talk is aimed at intermediate Python developers who need to build and validate robust data pipelines without access to actual production data.
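As a rough illustration of the statistical approach mentioned above, the following is a minimal sketch that uses only the Python standard library. It is not code from the talk, and it does not use Faker or SDV. All function names, field names, name lists, and distribution parameters are hypothetical choices made for this example. It generates seeded customer profiles, transactions that reference those customers, and a simple daily time series.

```python
import random
from datetime import date, timedelta

# Hypothetical name pools; a package such as Faker would normally supply these.
FIRST_NAMES = ["Thandi", "Sipho", "Aisha", "Johan", "Lerato"]
LAST_NAMES = ["Nkosi", "Naidoo", "Botha", "Dlamini", "Smith"]


def make_customers(n, rng):
    """Return n customer profiles with ages drawn from a clipped normal distribution."""
    return [
        {
            "customer_id": i,
            "name": f"{rng.choice(FIRST_NAMES)} {rng.choice(LAST_NAMES)}",
            # Assumed mean 40 and standard deviation 12, clipped to 18..90.
            "age": min(90, max(18, round(rng.gauss(40, 12)))),
        }
        for i in range(1, n + 1)
    ]


def make_transactions(customers, per_customer, rng):
    """Return transactions whose customer_id always refers to an existing customer."""
    rows = []
    for customer in customers:
        for _ in range(per_customer):
            rows.append(
                {
                    "customer_id": customer["customer_id"],
                    # Log-normal amounts are non-negative and right-skewed.
                    "amount": round(rng.lognormvariate(4.0, 0.8), 2),
                }
            )
    return rows


def make_daily_series(start, days, rng):
    """Return (date, value) pairs: a random walk with drift plus a weekday effect."""
    level = 100.0
    series = []
    for offset in range(days):
        day = start + timedelta(days=offset)
        level += rng.gauss(0.5, 2.0)  # drift plus noise
        weekday_effect = 10.0 if day.weekday() < 5 else -10.0
        series.append((day, round(level + weekday_effect, 2)))
    return series


# A fixed seed keeps the generated data identical across runs,
# which makes pipeline tests repeatable.
rng = random.Random(42)
customers = make_customers(5, rng)
transactions = make_transactions(customers, 3, rng)
series = make_daily_series(date(2024, 1, 1), 14, rng)
```

Seeding a dedicated `random.Random` instance, rather than the module-level generator, keeps the output reproducible even when other code also draws random numbers. The resulting rows could then be written out as CSV or SQL inserts for loading into a test database.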
Syllabus
Time: Oct 05 Thu:
Duration:
Taught by
PyCon South Africa