Overview
Discover strategies to optimize I/O performance and keep GPU utilization high while training machine learning models in the cloud. This 27-minute conference talk explores the challenges of data-intensive training, focusing on the frequent reads of small files such as images and audio. Learn about a novel architecture designed to improve the entire data pipeline and sustain the high throughput that GPUs demand. Gain insights into implementing this architecture for PyTorch workloads on Kubernetes in public cloud environments, and see how the data access patterns and I/O challenges of model training differ from those of traditional data analytics.
Syllabus
How to Eliminate the I/O Bottleneck and Continuously Feed the GPU While Training in the... - Lu Qiu
Taught by
Linux Foundation