Overview
Learn how to optimize Presto performance through distributed caching in this technical talk that addresses common challenges faced when working with cloud storage systems like S3. Explore solutions for slow query performance and high API costs through detailed explanations of distributed caching design patterns and real-world implementations. Discover advanced techniques including segmented data file caching, soft-affinity scheduler policies, cache filtering, TTL, and customized eviction strategies. Examine case studies from major technology companies like Meta, Uber, ByteDance, and Newsbreak to understand how they successfully implemented caching to optimize interactive queries, maximize hit rates, reduce cloud storage costs, and improve query performance. Master practical implementation strategies for setting up caching systems and measuring performance improvements using TPC-DS benchmark results.
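To make the soft-affinity idea mentioned above more concrete, here is a minimal sketch of how a scheduler might prefer sending reads of the same file to the same worker, so that worker's local cache is reused, while falling back to another worker when the preferred one is busy. This is an illustration only, not Presto's or Alluxio's actual implementation: the class name `SoftAffinityScheduler`, the `max_pending` threshold, and the use of a consistent-hash ring are all assumptions made for the example.

```python
import hashlib
from bisect import bisect_right


class SoftAffinityScheduler:
    """Hypothetical sketch: keep a file on one worker unless that worker is busy."""

    def __init__(self, workers, max_pending=2, virtual_nodes=50):
        self.max_pending = max_pending            # assumed per-worker busy threshold
        self.pending = {w: 0 for w in workers}    # splits currently assigned to each worker
        # Consistent-hash ring: each worker appears at several positions.
        self.ring = sorted(
            (self._hash(f"{w}#{i}"), w)
            for w in workers
            for i in range(virtual_nodes)
        )
        self.keys = [h for h, _ in self.ring]

    @staticmethod
    def _hash(value):
        return int(hashlib.sha256(value.encode()).hexdigest(), 16)

    def candidates(self, path):
        """Return distinct workers in ring order, starting at the path's position."""
        start = bisect_right(self.keys, self._hash(path))
        seen = []
        for i in range(len(self.ring)):
            worker = self.ring[(start + i) % len(self.ring)][1]
            if worker not in seen:
                seen.append(worker)
        return seen

    def assign(self, path):
        """Pick the first preferred worker with spare capacity ("soft" affinity)."""
        order = self.candidates(path)
        for worker in order:
            if self.pending[worker] < self.max_pending:
                self.pending[worker] += 1
                return worker
        # Every worker is at the threshold: fall back to the least loaded one.
        worker = min(order, key=lambda w: self.pending[w])
        self.pending[worker] += 1
        return worker


if __name__ == "__main__":
    scheduler = SoftAffinityScheduler(["worker-1", "worker-2", "worker-3"])
    path = "s3://example-bucket/table/part-0000.parquet"  # hypothetical path
    for _ in range(3):
        print(scheduler.assign(path))
```

In this sketch the first two assignments for a path go to its preferred worker, and the third spills over to the next worker on the ring once the preferred one reaches `max_pending`. The affinity is "soft" because cache locality is a preference that yields to load, not a hard constraint.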
Syllabus
Presto Optimization with Distributed Caching on Data Lake - Hope Wang & Beinan Wang, Alluxio
Taught by
Presto Foundation