Overview
Explore a conference talk from USENIX ATC '23 that introduces EnvPipe, a DNN training framework designed to save energy without compromising performance. Learn how EnvPipe improves energy efficiency in multi-GPU DNN training by using the slack time created by pipeline parallelism bubbles. See how the framework stretches the execution time of pipeline units by lowering the GPU's SM (streaming multiprocessor) frequency while maintaining the original accuracy of the training task. Gain insight into EnvPipe's implementation as a PyTorch library and its reported energy savings: up to 25.2% in single-node training with 4 GPUs and up to 28.4% in multi-node training with 16 GPUs, with performance degradation below 1% in both cases. Understand why this research matters for the energy consumption of data centers, particularly those running DNN training and inference services.
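The core idea of stretching a pipeline unit into its slack time can be illustrated with a small sketch. This is not EnvPipe's actual algorithm or API: the function name `pick_frequency`, the example frequency list, and the assumption that execution time scales inversely with SM frequency are all hypothetical simplifications chosen for illustration.

```python
def pick_frequency(exec_time_ms, slack_ms, freqs_mhz):
    """Pick the lowest SM frequency whose stretched run time still fits
    within the unit's original time plus its slack (the bubble).

    Assumption (not from the talk): run time scales as f_max / f.
    """
    f_max = max(freqs_mhz)
    budget_ms = exec_time_ms + slack_ms
    # Try frequencies from lowest to highest; lower frequency saves more energy.
    for f in sorted(freqs_mhz):
        stretched_ms = exec_time_ms * f_max / f
        if stretched_ms <= budget_ms:
            return f
    # No lower frequency fits, so keep running at full speed.
    return f_max


# A unit taking 10 ms with 5 ms of slack can drop to a lower frequency;
# a unit with no slack has to stay at the maximum.
with_slack = pick_frequency(10.0, 5.0, [900, 1200, 1500])
no_slack = pick_frequency(10.0, 0.0, [900, 1200, 1500])
```

Under this simplified model, units that sit next to a pipeline bubble can run slower without lengthening the overall iteration, while units with no slack keep the maximum frequency.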
Syllabus
USENIX ATC '23 - EnvPipe: Performance-preserving DNN Training Framework for Saving Energy
Taught by
USENIX