Optimus - Accelerating Large-Scale Multi-Modal LLM Training by Bubble Exploitation

USENIX via YouTube

Overview

Learn about Optimus, a distributed training system that accelerates large-scale multi-modal large language model (MLLM) training by exploiting GPU bubbles. Discover how researchers from Harvard University, ByteDance, and the University of Southern California address inefficiencies in current MLLM training systems, which suffer substantial GPU idle time (bubbles) caused by heterogeneous modality models and complex data dependencies under 3D parallelism.

Explore the principled analysis showing that scheduling encoder computation inside LLM bubbles can significantly reduce training bottlenecks, and understand how the system searches for separate parallel plans for the encoders and the LLM while preserving the original data dependencies. Examine the bubble scheduling algorithm, which exploits LLM bubbles without violating model architecture constraints, and the decomposition of encoder-layer computation into a series of optimized kernels that fit into sub-millisecond bubbles.

Review experimental results from production-cluster testing showing 20.5%–21.3% faster MLLM training with ViT-22B and GPT-175B models on 3072 GPUs compared to baseline systems, demonstrating practical gains for large-scale multimodal AI training infrastructure.
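To make the core idea concrete, here is a minimal, hypothetical sketch of bubble exploitation: a greedy scheduler that packs encoder kernel durations into idle pipeline intervals (bubbles). This is not Optimus's actual algorithm; the function name, durations, and packing heuristic are illustrative assumptions only.

```python
# Illustrative sketch only: greedily pack encoder "kernel" durations (ms)
# into LLM pipeline bubbles (idle GPU intervals, ms). The real Optimus
# scheduler additionally respects data dependencies and model constraints.

def pack_kernels_into_bubbles(bubbles_ms, kernels_ms):
    """Assign kernels to bubbles, largest kernels first, each going to the
    bubble with the most remaining room. Returns the per-bubble schedule
    and the total kernel time left over (must run outside any bubble)."""
    schedule = {i: [] for i in range(len(bubbles_ms))}
    remaining = list(bubbles_ms)
    leftover = []
    for k in sorted(kernels_ms, reverse=True):
        best = max(range(len(remaining)), key=lambda i: remaining[i])
        if remaining[best] >= k:
            schedule[best].append(k)   # kernel fits: runs inside this bubble
            remaining[best] -= k
        else:
            leftover.append(k)         # no bubble has room: runs serially
    return schedule, sum(leftover)

# Hypothetical example: three bubbles, encoder work decomposed into
# sub-millisecond kernels (the decomposition the talk describes).
bubbles = [2.0, 1.5, 0.8]
kernels = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
sched, extra = pack_kernels_into_bubbles(bubbles, kernels)
```

The point of the decomposition into small kernels is visible here: finer-grained work packs into short bubbles with little waste, whereas a monolithic encoder forward pass would not fit into any single bubble.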

Syllabus

USENIX ATC '25 - Optimus: Accelerating Large-Scale Multi-Modal LLM Training by Bubble Exploitation

Taught by

USENIX
