The Fastest Way to Become a Backend Developer Online
Become an AI & ML Engineer with Cal Poly EPaCE — IBM-Certified Training
Overview
Google, IBM & Meta Certificates — All 10,000+ Courses at 40% Off
One annual plan covers every course and certificate on Coursera. 40% off for a limited time.
Get Full Access
Explore engineering strategies for optimizing data science workflows using Spark in this 40-minute conference talk from the Data Science Festival Summer School 2023. Dive into a case study presented by Neil McCulloch, Data Science Engineer at dunnhumby, focusing on improving the performance of problematic PySpark applications. Learn how to slash runtimes in half for in-store availability reporting science. Gain insights into tackling large-scale data processing challenges and enhancing the efficiency of Spark-based data science projects. Discover practical approaches to optimize PySpark applications and streamline big data analytics workflows.
Syllabus
Spark at Scale: Engineering Strategies for Data Science Workflows
Taught by
Data Science Festival