Overview
Explore a conference presentation that introduces GMI-DRL, a novel systematic approach for scaling deep reinforcement learning (DRL) across multi-GPU platforms through adaptive-grained parallelism. Learn how researchers from Rice University, UC Santa Barbara, UC San Diego, University of Rochester, and Pacific Northwest National Laboratory address the inefficiencies in DRL computation caused by heterogeneous tasks and complex inter-task interactions on modern multi-GPU systems.

Discover the GPU Multiplexing Instance (GMI) concept, which provides a unified, resource-adjustable sub-GPU design tailored to heterogeneous DRL scaling tasks. Understand how the adaptive Coordinator component manages workloads and resources to optimize system performance, while the specialized Communicator enables efficient inter-GMI communication to meet diverse communication requirements.

Examine experimental results demonstrating GMI-DRL's performance against state-of-the-art DRL acceleration solutions: up to 2.34x higher training throughput and up to 40.8% better GPU utilization on the DGX-A100 platform. Gain insight into the growing importance of DRL in robotics applications such as industrial control and autonomous driving, and how this research addresses critical scalability challenges in the field.
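The division of labor described above — heterogeneous DRL stages mapped onto resource-adjustable sub-GPU instances by an adaptive coordinator — can be sketched in plain Python. Everything here (the `GMI` class, the proportional-share rule, the task names and load numbers) is a hypothetical illustration of the idea, not GMI-DRL's actual API or allocation policy.

```python
from dataclasses import dataclass

# Hypothetical sketch of adaptive-grained GPU multiplexing:
# a coordinator splits one GPU into resource-adjustable
# sub-instances ("GMIs") sized to each DRL task's measured load.

@dataclass
class GMI:
    task: str        # DRL pipeline stage hosted by this sub-GPU instance
    sm_share: float  # fraction of the GPU's compute assigned to it

def coordinate(loads: dict[str, float], total_share: float = 1.0) -> list[GMI]:
    """Assign each task a GPU share proportional to its relative load."""
    total_load = sum(loads.values())
    return [GMI(task, total_share * load / total_load)
            for task, load in loads.items()]

# Illustrative heterogeneous DRL stages with made-up relative loads.
gmis = coordinate({"env_simulation": 3.0, "inference": 1.0, "training": 4.0})
for g in gmis:
    print(f"{g.task}: {g.sm_share:.0%} of GPU")
```

An adaptive coordinator would re-run an allocation step like this as measured loads drift, resizing the sub-GPU instances rather than leaving a static partition to sit idle.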
Syllabus
USENIX ATC '25 - GMI-DRL: Empowering Multi-GPU DRL with Adaptive-Grained Parallelism
Taught by
USENIX