Overview
Explore how Extract, Transform, Load (ETL) processes are evolving in this 28-minute conference talk, which examines how branching and tagging in Apache Hive change data warehouse and data lake operations. Attila Turóczy, Senior Director of Engineering at Cloudera, demonstrates ETL methods that use Hive's branching and tagging features for data pipeline management, version control, and workflow optimization. The talk shows how these approaches allow more flexible data processing, improve data lineage tracking, and isolate development from production environments. It also covers practical strategies for modernizing ETL infrastructure with these Hive capabilities, including best practices for managing complex data transformations and maintaining data quality across branches and tagged versions.
Syllabus
The Future of ETL with Branching & Tagging in Apache Hive
Taught by
The ASF