Agentic Workload Inference at Scale - ByteDance's AIBrix and DeerFlow

Anyscale via YouTube

Overview

In this 31-minute conference talk from Ray Summit 2025, ByteDance engineers present their approach to large-scale LLM inference infrastructure, built on the open-source AIBrix control plane and the DeerFlow framework. The talk opens with the infrastructure challenges of production language model deployments, where performance, scalability, and cost efficiency have to be balanced for agentic systems. It then covers the LLM-focused capabilities AIBrix has developed in collaboration with the vLLM community: workload-aware autoscaling for resource management, KVCache management that uses multi-level caching and prefix-aware reuse to reduce memory pressure, and load balancing with cache-aware routing that adapts traffic distribution to varying load patterns. Further topics include dynamic LoRA orchestration and heterogeneous hardware support, both aimed at improving cost effectiveness across diverse clusters. The talk also demonstrates how AIBrix supports agentic workloads through DeerFlow, an open-source deep research framework, with examples such as building a personal research assistant on open-source LLMs. Viewers come away with an overview of AIBrix's architecture, its performance results, and its role in LLM infrastructure for AI agents that need reliable, scalable, low-latency execution.
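To make the idea of cache-aware routing more concrete, here is a minimal Python sketch. It is not AIBrix's implementation or API: the `Replica` class, the `pick_replica` function, and the `max_load` threshold are all hypothetical names chosen for illustration. The sketch assumes a router that knows which token prefixes each replica has cached and prefers the replica that can reuse the longest prefix, falling back to the least-loaded replica when nothing matches.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical model-serving replica as seen by an illustrative router."""
    name: str
    active_requests: int = 0
    # Token-ID prefixes this replica is assumed to hold in its KV cache.
    cached_prefixes: list[tuple[int, ...]] = field(default_factory=list)


def shared_prefix_len(a, b):
    """Return how many leading tokens two sequences have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def pick_replica(replicas, prompt_tokens, max_load=8):
    """Prefer the longest cached prefix; break ties by lowest load.

    Replicas at or above max_load (an assumed threshold) are skipped
    unless every replica is saturated.
    """
    candidates = [r for r in replicas if r.active_requests < max_load]
    if not candidates:
        candidates = list(replicas)

    def score(replica):
        best = max(
            (shared_prefix_len(p, prompt_tokens) for p in replica.cached_prefixes),
            default=0,
        )
        # min() picks the smallest tuple: longest prefix first, then least load.
        return (-best, replica.active_requests)

    return min(candidates, key=score)


replicas = [
    Replica("a", active_requests=2, cached_prefixes=[(1, 2, 3, 4)]),
    Replica("b", active_requests=0),
]
chosen = pick_replica(replicas, (1, 2, 3, 9))
print(chosen.name)
```

In this toy setup the request goes to replica `a` even though `b` is idle, because `a` can reuse three cached tokens. A production router would weigh cache hits against queueing delay more carefully; the tuple score here is only the simplest way to express the trade-off.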

Syllabus

Agentic Workload Inference at Scale: ByteDance’s AIBrix & DeerFlow | Ray Summit 2025

Taught by

Anyscale
