
YouTube

Optimizing RLHF Training for Large Language Models with Stage Fusion

USENIX via YouTube

Overview

Learn about RLHFuse, a training system that optimizes Reinforcement Learning from Human Feedback (RLHF) for large language models through stage fusion, in this 13-minute conference presentation from NSDI '25. Researchers from Peking University and StepFun address a critical problem in existing RLHF systems: low GPU utilization, caused by data skewness in the generation stage and pipeline bubbles in the training stage.

RLHFuse departs from the traditional stage-by-stage RLHF workflow by splitting stages into finer-grained subtasks and fusing them. The talk covers its two key techniques: inter-stage fusion, which overlaps the generation and inference stages at sample-level granularity to eliminate the bottleneck caused by long-tailed samples, and intra-stage fusion, which concurrently executes micro-batch subtasks under a fused pipeline schedule to reduce pipeline bubbles.

Experimental results show up to 3.7× higher training throughput compared to existing systems, making this essential viewing for researchers and practitioners working on large language model training optimization and distributed machine learning systems.
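To build intuition for why sample-level inter-stage fusion helps, here is a minimal back-of-the-envelope sketch (not the RLHFuse implementation; all timings are hypothetical). It compares a strict stage-by-stage schedule, where inference waits for every sample's generation to finish, against a fused schedule where each sample enters inference as soon as its own generation completes, so short samples' inference overlaps with the long-tail sample's generation.

```python
# Hypothetical per-sample generation times: one long-tailed sample
# dominates the generation stage (the data-skewness problem).
gen_times = [1, 1, 1, 1, 10]
inf_time = 2  # assumed per-sample inference (reward/critic) cost

# Stage-by-stage schedule: inference starts only after ALL
# generations have finished, so the long tail stalls everything.
staged_makespan = max(gen_times) + inf_time * len(gen_times)

# Fused schedule: each sample is handed to the (single) inference
# worker as soon as its own generation ends; inference of short
# samples overlaps with the long-tail sample's generation.
worker_free = 0.0
for g in sorted(gen_times):  # samples finish generating in this order
    worker_free = max(worker_free, g) + inf_time
fused_makespan = worker_free

print(f"staged makespan: {staged_makespan}")  # 10 + 2*5 = 20
print(f"fused  makespan: {fused_makespan}")   # 12.0
```

The fused schedule finishes in 12 time units instead of 20, purely by overlapping the two stages; the real system generalizes this idea across many GPUs and also fuses micro-batch subtasks within the training stage.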

Syllabus

NSDI '25 - Optimizing RLHF Training for Large Language Models with Stage Fusion

Taught by

USENIX

