Bridging Big Data and Machine Learning Ecosystems - A Cloud Native Approach Using Kubeflow
CNCF [Cloud Native Computing Foundation] via YouTube
Overview
This 33-minute CNCF conference talk shows how to connect big data systems with machine learning frameworks using cloud-native technologies. It focuses on minimizing data movement and serialization overhead between scalable big data systems such as Apache Spark and Iceberg and machine learning frameworks such as PyTorch. In traditional workflows, data is serialized on its way from storage formats like Parquet/Iceberg into the training framework, which creates costly bottlenecks and leads to inefficient resource utilization in distributed training environments. The talk presents a cloud-native approach that uses Kubeflow for end-to-end machine learning orchestration and Apache Arrow for high-performance data interchange, integrating analytics and ML workflows while improving performance and resource efficiency.
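To make the serialization overhead concrete, here is a minimal sketch using only the Python standard library. It is not taken from the talk and does not use the Arrow, Spark, or PyTorch APIs: `array` stands in for a columnar batch, `pickle` for a serialize/deserialize handoff, and `memoryview` for a shared-buffer handoff of the kind a common in-memory format makes possible. The names `column`, `via_serialization`, and `via_shared_buffer` are hypothetical.

```python
import pickle
from array import array

# Hypothetical column of feature values standing in for a columnar batch.
column = array("d", [0.5, 1.5, 2.5, 3.5])


def via_serialization(col):
    """Hand data over by serializing and deserializing: yields an independent copy."""
    return pickle.loads(pickle.dumps(col))


def via_shared_buffer(col):
    """Hand data over as a view on the same memory: no bytes are copied."""
    return memoryview(col)


copied = via_serialization(column)
shared = via_shared_buffer(column)

# Mutate the producer's buffer to show which consumer shares its memory.
column[0] = 9.0

copy_sees_change = copied[0] == 9.0  # the round-tripped copy is unaffected
view_sees_change = shared[0] == 9.0  # the view reads the producer's memory
```

The serialized path pays for encoding, decoding, and a second copy of the data, while the view only passes a reference to the existing buffer. That cost difference is the bottleneck the talk describes at much larger scale.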
Syllabus
Bridging Big Data and Machine Learning Ecosystems: A Cloud Native Approac... Johnu George & Shiv Jha
Taught by
CNCF [Cloud Native Computing Foundation]