Distributed ML Training with KubeRay at Robinhood

Anyscale via YouTube

Overview

Explore how Robinhood scaled its machine learning platform to support large-model and large-dataset training through distributed training with KubeRay in this 32-minute conference talk from Ray Summit 2025. Learn from Lanting Chiang and Robert Macy as they detail Robinhood's journey from single-node training limitations to implementing distributed training capabilities essential for future model development. Discover the evaluation process and architectural decisions that led to adopting KubeRay for large-scale distributed training, including how Ray was integrated into their existing ML training stack. Understand the platform-level abstractions Robinhood built to make distributed training seamless and accessible for internal teams, and examine how their unique Kubernetes environment influenced their choice between native KubeRay components and alternative solutions. Gain practical insights into integrating Ray into a production ML platform, including lessons learned, architectural best practices, and strategies for enabling distributed training at scale in real-world enterprise environments.

Syllabus

Ray @ Robinhood: Distributed ML Training with KubeRay | Ray Summit 2025

Taught by

Anyscale
