

How DataRobot Parallelizes Agentic Pipeline Searches with Ray and syftr

Anyscale via YouTube

Overview

This 16-minute conference talk from Ray Summit 2025 explains how DataRobot uses Ray to parallelize agentic pipeline searches with its syftr framework. Building a high-quality agentic pipeline means selecting and tuning many components (vector databases, embedding models, chunkers, retrievers, synthesizing LLMs, verifiers, rewriters, and rerankers), each with interdependent hyperparameters and tradeoffs among latency, accuracy, and cost.

syftr runs an efficient, distributed, multi-objective search over a configuration space of roughly 10²³ possible flows, using Bayesian Optimization to find Pareto-optimal pipelines that jointly optimize cost and accuracy. An early-stopping mechanism prunes suboptimal candidates to reduce compute overhead, and the resulting workflows are approximately 9× cheaper than highly accurate baselines while retaining most of their accuracy.

Ray provides the distributed execution behind this large-scale AutoML for agentic pipelines. It scales vector database (VDB) construction across heterogeneous clusters and handles both CPU-heavy small-model pipelines and GPU-dependent large-model pipelines on T4, A100, and H100 configurations. During search, Ray Serve automatically scales open-source (OSS) LLMs and embedding models, elastically allocating compute to the models the optimizer favors and scaling the others to zero to save cost. The talk also covers how Ray's unified distributed computing model supplies the reliability, elasticity, and developer ergonomics needed for large-scale AI infrastructure research, and how these techniques apply to designing high-performance, cost-efficient, production-ready agentic pipelines.
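
To make the term "Pareto-optimal" in the overview concrete, here is a minimal Python sketch of selecting the non-dominated pipelines from a set of evaluated candidates. It is not syftr's implementation: the `Candidate` type, the pipeline names, and every number are hypothetical and chosen only for illustration, and the sketch covers only the final filtering step, not the Bayesian Optimization or early stopping described in the talk.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    name: str        # hypothetical pipeline label
    cost: float      # illustrative cost units; lower is better
    accuracy: float  # fraction of correct answers; higher is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is at least as good as b on both objectives and strictly better on one."""
    return (a.cost <= b.cost and a.accuracy >= b.accuracy
            and (a.cost < b.cost or a.accuracy > b.accuracy))


def pareto_front(candidates: list[Candidate]) -> list[Candidate]:
    """Return the non-dominated candidates, sorted by increasing cost."""
    front = [c for c in candidates
             if not any(dominates(other, c) for other in candidates)]
    return sorted(front, key=lambda c: c.cost)


# Made-up numbers for illustration only; these are not results from the talk.
candidates = [
    Candidate("large-llm-rerank", cost=9.0, accuracy=0.90),
    Candidate("mid-llm-rerank", cost=3.0, accuracy=0.86),
    Candidate("mid-llm-no-rerank", cost=3.5, accuracy=0.80),
    Candidate("small-llm", cost=1.0, accuracy=0.78),
    Candidate("small-llm-big-chunks", cost=1.2, accuracy=0.70),
]

front = pareto_front(candidates)
for c in front:
    print(f"{c.name}: cost={c.cost}, accuracy={c.accuracy}")
```

In this toy data, two candidates drop out because another pipeline is both cheaper and more accurate. The three that remain each represent a different cost and accuracy tradeoff, which is the kind of menu a multi-objective search is meant to produce.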

Syllabus

How DataRobot Parallelizes Agentic Pipeline Searches with Ray + syftr | Ray Summit 2025

Taught by

Anyscale
