Scaling Multi-Modal Datasets to Petabytes with Ray at Apple

Anyscale via YouTube

Overview

Learn how to scale multimodal data processing frameworks to petabyte-scale workloads from Apple engineers, who share their production experience at Ray Summit 2025. Discover why demand for large-scale processing over multimodal datasets is rising, and understand the challenge of designing systems that are both easy to use and extremely scalable. Explore how Ray's distributed computing model provides flexibility, and why reaching petabyte-level performance still requires careful attention to bottlenecks in the data processing layers, Ray's core runtime, and the underlying infrastructure. Examine key strategies for mitigating performance bottlenecks, keeping infrastructure resilient, and tuning Ray's internal components to prevent cascading failures in massive, tightly coupled distributed systems. Gain practical insights from how Apple architected and productionized its petabyte-scale multimodal pipeline as a self-serve service that processes enormous datasets reliably and efficiently. Walk away with actionable knowledge for building and operating large-scale multimodal data workflows in production as LLM applications continue to grow in complexity.

Syllabus

Scaling Multi-Modal Datasets to Petabytes with Ray at Apple | Ray Summit 2025

Taught by

Anyscale
