Boosting Presto Query Speeds with Hudi's Metadata Table and Clustering Service
Apache Hudi via YouTube
Overview
Learn how to optimize Presto query performance using Apache Hudi's data management features in this 53-minute technical talk. Discover how the clustering service automatically reorganizes and resizes data files for efficient retrieval: instead of leaving data laid out in the order it was ingested, it co-locates records that are frequently queried together. Explore how Hudi's metadata table maintains file listings for tables on cloud object stores such as AWS S3, so that queries avoid expensive listing operations. See how combining Hudi's clustering and metadata table with Presto yields faster queries and supports interactive analytics on large-scale data lakes.
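As a rough illustration of the two features described above, the following Python sketch builds a set of Hudi writer options that turn on the metadata table and inline clustering. The helper name `hudi_write_options`, the table and column names, and every numeric value are assumptions chosen for illustration; they are not taken from the talk, and the available options should be checked against the Hudi version in use.

```python
def hudi_write_options(table_name, sort_columns):
    """Build illustrative Hudi writer options (hypothetical helper, example values)."""
    mb = 1024 * 1024
    return {
        "hoodie.table.name": table_name,
        # Maintain file listings in Hudi's metadata table rather than
        # listing the object store directly.
        "hoodie.metadata.enable": "true",
        # Run clustering as part of the write path, every few commits.
        "hoodie.clustering.inline": "true",
        "hoodie.clustering.inline.max.commits": "4",
        # Files below this size are candidates for clustering (example value).
        "hoodie.clustering.plan.strategy.small.file.limit": str(300 * mb),
        # Upper bound on the size of files produced by clustering (example value).
        "hoodie.clustering.plan.strategy.target.file.max.bytes": str(1024 * mb),
        # Sort by the columns queries filter on, so related records are co-located.
        "hoodie.clustering.plan.strategy.sort.columns": ",".join(sort_columns),
    }


# Hypothetical table and columns, for illustration only.
opts = hudi_write_options("trips", ["city", "event_ts"])
for key in sorted(opts):
    print(key, "=", opts[key])
```

With Spark, a dictionary like this would typically be passed to a Hudi write, for example `df.write.format("hudi").options(**opts)`. Whether Presto reads file listings from the metadata table is configured separately on the query engine side.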
Syllabus
Boost Presto query speeds with Hudi's metadata table & clustering service
Taught by
Apache Hudi