Overview
Learn about verl, an open-source reinforcement learning framework designed for large language models, in this 30-minute conference talk from Ray Summit 2025. ByteDance Seed's team explains the challenges of scaling RL with billion-parameter models, where existing frameworks often lack proper abstractions for orchestrating complex dataflows and managing resources efficiently across large GPU clusters.

The talk explores verl's Ray-based hybrid-controller architecture, which provides high-level abstractions for dataflow orchestration and resource management: the entire RL workflow runs as a single controller process on the Ray driver, which delegates computation to WorkerGroup and ResourcePool components for distributed execution across the cluster. This design achieves high throughput and strong extensibility while integrating with major training backends (FSDP, FSDP2, and Megatron-LM) and inference engines (vLLM and SGLang), and it supports RL algorithms such as PPO, GRPO, and DAPO with straightforward scaling.

Finally, gain insight into how verl has been adopted in both academic and industry settings, making reinforcement learning more accessible and scalable for large language model applications in reasoning and tool use.
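To make the hybrid-controller idea concrete, here is a minimal plain-Python sketch of the pattern the talk describes: one controller process drives the RL dataflow while delegating work to worker groups drawn from a resource pool. The class names `ResourcePool` and `WorkerGroup` mirror the concepts named above, but the methods and signatures here are illustrative stand-ins, not verl's actual API; in verl these would be Ray actors rather than local objects.

```python
# Schematic sketch of the hybrid-controller pattern (no Ray dependency).
# All names and signatures are illustrative, not verl's real API.

class ResourcePool:
    """Tracks a pool of GPU ids and hands slices to worker groups."""
    def __init__(self, num_gpus: int):
        self.free = list(range(num_gpus))

    def allocate(self, n: int) -> list[int]:
        assert len(self.free) >= n, "not enough GPUs in the pool"
        chunk, self.free = self.free[:n], self.free[n:]
        return chunk


class WorkerGroup:
    """Stands in for a group of remote workers (Ray actors in verl)
    that all play one role: rollout generation, policy training, etc."""
    def __init__(self, role: str, gpus: list[int]):
        self.role, self.gpus = role, gpus

    def run(self, step: str, batch):
        # In verl this would dispatch `step` to every remote worker;
        # here we just tag the batch to make the dataflow visible.
        return {"role": self.role, "step": step, "batch": batch}


def controller_step(pool: ResourcePool, prompts: list[str]) -> dict:
    """One RL iteration, driven entirely from the single controller."""
    rollout = WorkerGroup("rollout", pool.allocate(4))  # e.g. vLLM engines
    trainer = WorkerGroup("trainer", pool.allocate(4))  # e.g. FSDP shards
    samples = rollout.run("generate", prompts)          # generate responses
    return trainer.run("ppo_update", samples)           # update the policy


result = controller_step(ResourcePool(num_gpus=8), ["prompt-1", "prompt-2"])
print(result["role"], result["step"])  # trainer ppo_update
```

The point of the pattern is that the control flow of the whole algorithm (generate, then update) lives in one ordinary Python function on the driver, while the heavy computation is farmed out; swapping PPO for GRPO or DAPO changes only which steps the controller dispatches.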
Syllabus
Meet verl: An RL Framework for LLM Reasoning & Tool Use | Ray Summit 2025
Taught by
Anyscale