Bridging Big Data and Machine Learning Ecosystems - A Cloud Native Approach Using Kubeflow
CNCF [Cloud Native Computing Foundation] via YouTube
Overview
Learn how to bridge big data systems and machine learning frameworks using cloud-native technologies in this 33-minute conference talk from CNCF. The talk examines the challenge of minimizing data movement and serialization overhead when connecting scalable big data systems such as Apache Spark and Iceberg with machine learning frameworks such as PyTorch. In traditional workflows, data is serialized between storage formats like Parquet/Iceberg and the training framework, which creates costly bottlenecks and leads to inefficient resource utilization in distributed training environments. The speakers present a cloud-native approach that uses Kubeflow for end-to-end machine learning orchestration and Apache Arrow for high-performance data interchange, integrating analytics and ML workflows while improving performance and resource efficiency.
Syllabus
Bridging Big Data and Machine Learning Ecosystems: A Cloud Native Approac... Johnu George & Shiv Jha
Taught by
CNCF [Cloud Native Computing Foundation]