Overview
This 28-minute conference talk examines how branching and tagging in Apache Hive are changing Extract, Transform, Load (ETL) processes in data warehouses and data lakes. Attila Turóczy, Senior Director of Engineering at Cloudera, demonstrates ETL methods that use these features to improve data pipeline management, version control, and workflow optimization. The talk shows how branches and tags allow more flexible data processing, strengthen data lineage tracking, and isolate development from production environments. It closes with practical strategies for modernizing ETL infrastructure with Hive, including best practices for managing complex data transformations and maintaining data quality across branches and tagged versions.
Syllabus
The Future of ETL with Branching & Tagging in Apache Hive
Taught by
The ASF