Bridging Big Data and Machine Learning Ecosystems - A Cloud Native Approach Using Kubeflow
CNCF [Cloud Native Computing Foundation] via YouTube
Overview
Learn how to bridge the gap between big data systems and machine learning frameworks using cloud-native technologies in this 33-minute conference talk from CNCF. Explore the challenges of minimizing data movement and serialization overhead when connecting scalable big data systems like Apache Spark and Iceberg with machine learning frameworks such as PyTorch. Discover how traditional workflows create costly bottlenecks by serializing data between storage and table formats such as Parquet and Iceberg and the training frameworks that consume them, leading to inefficient resource utilization in distributed training environments. Examine a cloud-native solution that uses Kubeflow for end-to-end machine learning orchestration and Apache Arrow for high-performance data interchange, integrating analytics and ML workflows while improving performance and resource efficiency.
Syllabus
Bridging Big Data and Machine Learning Ecosystems: A Cloud Native Approac... Johnu George & Shiv Jha
Taught by
CNCF [Cloud Native Computing Foundation]