SkyRL - A Scalable and Flexible Post-Training Framework for Agentic Language Models

Discover how to build scalable reinforcement learning (RL) frameworks for agentic language models in this 32-minute conference talk from Ray Summit 2025. Learn about SkyRL's modular architecture and the design principles that enable rapid experimentation while keeping post-training scalable and efficient. Explore the key components, including modular policy training, customizable reward pipelines, and efficient distributed execution, that let researchers iterate quickly without sacrificing performance. Understand the infrastructure challenges of scaling RL for agentic workloads: supporting high-throughput environment interactions, optimizing distributed rollouts, and managing heterogeneous compute requirements across diverse tasks and model sizes. Gain insights from the development team's lessons learned about how principled framework design can accelerate RL research, improve reproducibility, and unlock new possibilities for training advanced agentic language models.