Boosting Presto Query Speeds with Hudi's Metadata Table and Clustering Service
Apache Hudi via YouTube
Overview
Learn how to optimize Presto query performance using Apache Hudi's data management features in this 53-minute technical talk. See how the clustering service automatically reorganizes and resizes data files so that frequently accessed records are co-located, rather than leaving data laid out in the order it was ingested. Explore how Hudi's metadata table maintains file listings for tables on cloud object stores such as AWS S3, so queries can avoid expensive listing operations. The talk shows how combining Hudi's clustering and metadata table with Presto enables faster interactive analytics on large-scale data lakes.
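For orientation, here is a minimal sketch of how the two features described above might be switched on for a Hudi write. It is not taken from the talk. The helper name `hudi_write_options`, the table name, the sort columns, and the size values are all hypothetical and chosen only for illustration; the `hoodie.*` keys are standard Hudi configuration properties.

```python
def hudi_write_options(table_name, sort_columns,
                       target_file_mb=1024, small_file_mb=300,
                       commits_between_clustering=4):
    """Build a dict of Hudi write options (hypothetical helper, not a Hudi API).

    The numeric defaults here are illustrative assumptions, not tuning advice.
    """
    mb = 1024 * 1024
    return {
        "hoodie.table.name": table_name,
        # Metadata table: keeps file listings so readers can skip object-store LIST calls.
        "hoodie.metadata.enable": "true",
        # Inline clustering: rewrite files every N commits as part of the write.
        "hoodie.clustering.inline": "true",
        "hoodie.clustering.inline.max.commits": str(commits_between_clustering),
        # Sort by the columns queries filter on, so related records are co-located.
        "hoodie.clustering.plan.strategy.sort.columns": ",".join(sort_columns),
        # Target size for clustered output files, in bytes.
        "hoodie.clustering.plan.strategy.target.file.max.bytes": str(target_file_mb * mb),
        # Files smaller than this become candidates for clustering, in bytes.
        "hoodie.clustering.plan.strategy.small.file.limit": str(small_file_mb * mb),
    }


# Example with assumed table and column names.
options = hudi_write_options("trips", ["city_id", "rider_id"])
for key, value in sorted(options.items()):
    print(f"{key}={value}")
```

A dict like this could be passed to a Spark DataFrame writer, for example `df.write.format("hudi").options(**options)`, alongside the record key and other settings a real table needs. Whether Presto reads file listings from the metadata table depends on connector configuration that is not shown here.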
Syllabus
Boost Presto query speeds with Hudi's metadata table & clustering service
Taught by
Apache Hudi