How Workday Achieved 50x Cheaper Model Serving with Ray Serve

Anyscale via YouTube

Overview

Learn how Workday rebuilt its ML model-serving architecture on Ray Serve, cutting serving costs by 50x while maintaining low latency and high reliability across tens of thousands of models. By early 2023, serving a dedicated ML model for every tenant across multiple environments had become increasingly expensive and difficult to scale. The talk covers the ground-up redesign on Ray Serve, which now powers models across more than a dozen environments using built-in autoscaling and efficient request routing.

Examine the usage patterns that pushed Ray Serve beyond its original design limits: early deployments hit scalability ceilings at just dozens of applications, and Workday contributed improvements back to the open-source Ray project so that a single cluster can support thousands of applications. Along the way, gain insight into Ray Serve internals, practical patterns for building complex serving systems, and the architectural challenges encountered during scaling. Come away understanding how to use Ray Serve for large-scale model serving, and with inspiration to contribute to the Ray ecosystem yourself.

Syllabus

How Workday Achieved 50x Cheaper Model Serving with Ray Serve | Ray Summit 2025

Taught by

Anyscale
