Overview
This 24-minute conference talk challenges the misconception that all data processing requires a distributed system such as Spark. Learn why processing hundreds of megabytes, or even gigabytes, of data with heavyweight distributed frameworks is often unnecessary and wasteful. Discover how advances in memory density and CPU performance, combined with efficient data engines like DuckDB, distributed storage, and increased bandwidth, make it possible to do more with less in today's post-ZIRP economy. Explore the benefits of small-data approaches, understand why more data doesn't always mean better results, and see how a single machine can provide an efficient, powerful solution with the added simplicity of local development.
Syllabus
Data infrastructure to build bigger with less
Taught by
Open Data Science