ETL at Scale - Optimizing PySpark and Airflow Workflows

Conf42 via YouTube

Overview

Learn to optimize large-scale ETL workflows using PySpark and Apache Airflow in this 18-minute conference talk from Conf42 Kube Native 2025. Discover cloud-native ETL fundamentals and understand the key challenges faced when scaling data processing pipelines. Explore techniques for building stable foundations for your ETL infrastructure and dive deep into Spark performance optimization strategies. Master Airflow orchestration techniques to efficiently manage complex data workflows and implement comprehensive monitoring and observability practices. Follow a practical implementation roadmap that guides you through deploying optimized ETL solutions at scale, complete with real-world examples and best practices for production environments.

Syllabus

00:00 Introduction to Cloud Native ETL
01:40 ETL Scaling Challenges
03:08 Building a Stable Foundation
04:45 Optimizing Spark Performance
07:32 Airflow Orchestration Techniques
11:50 Monitoring and Observability
13:08 Implementation Roadmap
16:50 Key Takeaways and Conclusion

Taught by

Conf42
