Running Remote Shuffle Service to Solve Apache Spark's Dynamic Resource Allocation Challenge on Kubernetes
CNCF [Cloud Native Computing Foundation] via YouTube
Overview
Explore a novel solution to Apache Spark's dynamic resource allocation (DRA) challenge on Kubernetes using an open-source remote shuffle service (RSS). Gain insights into Spark's DRA in Kubernetes, learn how the RSS alleviates resource contention issues, and discover a more reliable and scalable solution for big data processing. Understand how offloading shuffle data to remote storage outside Spark's executor pods can decouple storage and compute, supporting dynamic scaling needs. Delve into the implementation details, benefits, and potential impact of this approach for efficient large-scale data processing in machine learning and ETL use cases.
Syllabus
Running Remote Shuffle Service to Solve a Well-Known Challenge for Apa... Melody Yang & Keyong Zhou
Taught by
CNCF [Cloud Native Computing Foundation]