Building Reproducible ML Processes with an Open Source Stack
Toronto Machine Learning Series (TMLS) via YouTube
Overview
Learn how to create reproducible machine learning experiments in this 31-minute conference talk from the Toronto Machine Learning Series. Explore the essential components of a reproducible ML process: MLflow Projects for code reproducibility, lakeFS for data versioning, and infrastructure as code for environment consistency. Follow along with a practical code demonstration that recreates an experiment using the same input data, code, and processing environment as a previous run. Learn techniques for creating data snapshots through commits, tagging those snapshots effectively, and keeping the histories of code and data in sync using an open-source technology stack.
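As a rough illustration of the idea described above (this is not code from the talk), the sketch below records the three things a rerun needs: a code commit, a data snapshot reference, and an environment version. The class name, field names, tag prefix, and all example values are hypothetical. The only external convention assumed is the lakeFS object URI layout, lakefs://<repository>/<ref>/<path>, where the ref can be a commit ID or a tag.

```python
from dataclasses import dataclass, asdict
import json


@dataclass(frozen=True)
class RunSnapshot:
    """Pins code, data, and environment for one experiment run.

    All names here are hypothetical and chosen for illustration only.
    """
    code_commit: str  # Git commit SHA of the training code
    data_repo: str    # lakeFS repository name
    data_ref: str     # lakeFS commit ID or tag; a branch name would move over time
    env_image: str    # container image or infrastructure template version

    def data_uri(self, path: str) -> str:
        # Build a lakeFS URI of the form lakefs://<repository>/<ref>/<path>.
        return f"lakefs://{self.data_repo}/{self.data_ref}/{path.lstrip('/')}"

    def to_tags(self) -> dict:
        # Flatten the snapshot into key/value tags that could be attached
        # to an experiment run in whatever tracking tool is in use.
        return {f"repro.{key}": value for key, value in asdict(self).items()}


# Example values are made up.
snapshot = RunSnapshot(
    code_commit="3f2a9c1",
    data_repo="example-repo",
    data_ref="v1.0-training-data",
    env_image="example/trainer:1.4.2",
)

print(snapshot.data_uri("datasets/train.parquet"))
print(json.dumps(snapshot.to_tags(), indent=2))
```

Storing an immutable reference (a commit ID or tag) rather than a branch name is what lets a later run read exactly the same input data, alongside the same code commit and environment version.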
Syllabus
Building Reproducible ML Processes with an Open Source Stack
Taught by
Toronto Machine Learning Series (TMLS)