Overview
Learn practical design patterns for building scalable synthetic data generation pipelines by combining Ray Data, Ray Serve, and vLLM in this 16-minute conference talk from Ray Summit 2025. Discover how to overcome the challenges of coordinating large numbers of inference calls, managing multi-step agentic workflows, and maintaining reliable throughput across heterogeneous GPU clusters using Ray's unified execution model.

Explore concrete implementation strategies, including leveraging Ray Data for distributed data ingestion, transformation, batching, and parallelization, while using Ray Serve with vLLM for high-performance inference across multiple agents. Master techniques for integrating agents into multi-step refinement loops to ensure correctness and improve data quality, and learn effective approaches for managing GPU allocation, backpressure, and autoscaling in synthetic data pipelines.

Follow along with a hands-on demonstration of building a two-agent self-refinement loop powered by Ray Serve and vLLM, and see how it integrates seamlessly into a Ray Data workflow to create a robust, end-to-end synthetic data generation pipeline. Gain actionable patterns for constructing scalable synthetic data systems and develop a deeper understanding of how Ray's components combine to power complex, high-throughput LLM workflows essential for modern machine learning development.
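The two-agent self-refinement loop the talk demonstrates can be sketched in plain Python. This is not the speaker's implementation: in the talk, the generator and critic are LLM deployments served by Ray Serve with vLLM and the loop runs inside a Ray Data workflow; here they are stub functions (`generate`, `critique`), and the scoring threshold and retry budget are illustrative assumptions, so only the control flow of the pattern is shown.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Draft:
    """One synthetic-data sample with its critic score."""
    prompt: str
    text: str
    score: float

def generate(prompt: str, feedback: Optional[str] = None) -> str:
    # Stand-in for the generator agent (an LLM call via Ray Serve + vLLM
    # in the talk). If critic feedback exists, produce a revised draft.
    base = f"answer({prompt})"
    return base + " [revised]" if feedback else base

def critique(text: str) -> Tuple[float, str]:
    # Stand-in for the critic agent: returns a quality score and feedback.
    # This toy critic simply rewards drafts that have been revised once.
    if "[revised]" in text:
        return 1.0, ""
    return 0.4, "needs more detail"

def refine(prompt: str, threshold: float = 0.8, max_rounds: int = 3) -> Draft:
    """Generate, critique, and re-generate until the critic's score
    clears the threshold or the retry budget is exhausted."""
    feedback: Optional[str] = None
    text, score = "", 0.0
    for _ in range(max_rounds):
        text = generate(prompt, feedback)
        score, feedback = critique(text)
        if score >= threshold:
            break  # draft accepted; stop refining
    return Draft(prompt, text, score)
```

In the pipeline described in the talk, a function like `refine` would be applied across a dataset in parallel (e.g. via a Ray Data batch transformation), with Ray handling GPU allocation, backpressure, and autoscaling for the underlying model replicas.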
Syllabus
LiquidAI’s Approach to Large-Scale Synthetic Data Generation Using Ray | Ray Summit 2025
Taught by
Anyscale