Overview
Learn practical design patterns for building scalable synthetic data generation pipelines by combining Ray Data, Ray Serve, and vLLM in this 16-minute conference talk from Ray Summit 2025. Discover how to overcome the challenges of coordinating large numbers of inference calls, managing multi-step agentic workflows, and maintaining reliable throughput across heterogeneous GPU clusters using Ray's unified execution model.

Explore concrete implementation strategies, including using Ray Data for distributed data ingestion, transformation, batching, and parallelization, alongside Ray Serve with vLLM for high-performance inference across multiple agents. Master techniques for integrating agents into multi-step refinement loops that enforce correctness and improve data quality, and learn effective approaches for managing GPU allocation, backpressure, and autoscaling in synthetic data pipelines.

Follow along with a hands-on demonstration of building a two-agent self-refinement loop powered by Ray Serve and vLLM, and see how it integrates seamlessly into a Ray Data workflow to form a robust, end-to-end synthetic data generation pipeline. Gain actionable patterns for constructing scalable synthetic data systems and a deeper understanding of how Ray's components combine to power the complex, high-throughput LLM workflows essential to modern machine learning development.
Syllabus
LiquidAI’s Approach to Large-Scale Synthetic Data Generation Using Ray | Ray Summit 2025
Taught by
Anyscale