Overview
Learn to benchmark and optimize storage systems for AI workloads in this conference presentation, which addresses the gap between traditional storage benchmarks and the I/O demands of modern AI training. State-of-the-art AI models and LLMs stress storage with heavy metadata operations, multi-threaded asynchronous I/O, random access patterns, and complex data formats, none of which traditional throughput-focused benchmarks capture. The talk covers the MLPerf Storage Working Group's development of a benchmark suite, with a detailed look at the DLIO benchmark, which is designed to model these I/O behaviors realistically. It also reviews technical lessons learned from benchmark development and submission cycles, including the critical I/O access patterns identified in training pipelines, such as data loading and model checkpointing.
Takeaways include guidance on designing and configuring storage hardware and software stacks for AI workloads, an analysis of the I/O behavior patterns that inform system architecture decisions, and the use of open-source tools to identify and resolve storage bottlenecks that traditional benchmarks overlook, such as metadata contention and asynchronous I/O patterns. You will also learn to use the DLIO benchmark to evaluate and compare storage system performance realistically for AI and LLM workloads.
Syllabus
SNIA SDC 2025 - Beyond Throughput: Benchmarking Storage for the Complex I/O Patterns of AI
Taught by
SNIAVideo