Distributed Model Training with Ray at Capital One

Anyscale via YouTube

Overview

Learn how Capital One modernized its complex, multi-framework ML ecosystem by adopting Ray, moving from CPU-bound limitations to scalable, GPU-accelerated distributed computing, in this 30-minute conference talk from Ray Summit 2025. Discover Capital One's distributed compute architecture, in which Kubernetes manages cluster resources and the KubeRay Operator deploys and orchestrates GPU-enabled, multi-node Ray clusters. Follow the technical evolution from a constrained, single-node Ray setup to a distributed platform designed to support diverse ML workloads across teams. Explore a real-world use case built on Ray Tune for distributed hyperparameter optimization, including the data loading bottleneck that degraded performance after the migration to KubeRay. Understand its root causes, its symptoms (GPU underutilization and network congestion), and the debugging process. Examine a comparative study of two data loading strategies: Ray Data (Ray's distributed data ingestion and preprocessing framework) versus a custom manual sharding technique developed in-house. Review metrics for memory usage, network I/O, and GPU utilization, with early results that indicate substantial cost-efficiency improvements for large GPU workloads. Gain practical insights into scaling ML compute with Ray, diagnosing real-world performance bottlenecks, and choosing the right data strategy for high-throughput distributed training.
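
The overview does not describe how Capital One's in-house manual sharding works, so the following is only a generic sketch of the idea: each worker reads a disjoint subset of the input files instead of every worker pulling the full dataset over the network. The function name `shard_for_worker`, the `rank` and `world_size` parameters, and the file names are all hypothetical and are not taken from the talk.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def shard_for_worker(items: Sequence[T], rank: int, world_size: int) -> List[T]:
    """Return the strided subset of `items` assigned to worker `rank`.

    Hypothetical helper: worker 0 takes items 0, N, 2N, ...; worker 1 takes
    items 1, N+1, ...; and so on, where N is `world_size`. The shards are
    disjoint and together cover every item exactly once.
    """
    if world_size <= 0:
        raise ValueError("world_size must be positive")
    if not 0 <= rank < world_size:
        raise ValueError("rank must be in the range [0, world_size)")
    return list(items[rank::world_size])


if __name__ == "__main__":
    # Illustrative file names only; a real job would list its own dataset.
    files = [f"part-{i:04d}.parquet" for i in range(10)]
    for rank in range(3):
        print(rank, shard_for_worker(files, rank, world_size=3))
```

Strided assignment keeps shard sizes within one item of each other without any coordination between workers. Whether this matches the technique compared against Ray Data in the talk is an assumption.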

Syllabus

Distributed Model Training with Ray at Capital One | Ray Summit 2025

Taught by

Anyscale
