Conceptualizing Next Generation Memory and Storage Optimized for AI Inference
Open Compute Project via YouTube
Overview
Explore the evolving landscape of memory and storage systems designed specifically for AI inference in this conference talk. Discover how the unprecedented growth of large language models (LLMs) and expanding context lengths are driving massive data access demands, particularly for model weights and Key-Value (KV) cache data.

Learn about the resulting paradigm shift in memory and storage architecture, where traditional designs face new challenges and compute must be offloaded into or near memory to improve energy efficiency and reduce power consumption. Examine the distinctive access patterns of AI inference workloads, including their read-skewed nature relative to general-purpose applications and their semi-sequential storage access behaviors, which defy conventional expectations.

Gain insights into next-generation memory concepts, including Processing-In-Memory (PIM) technology that improves energy efficiency and bandwidth by reducing excessive data movement. Understand pathfinding developments in read-skewed high-capacity memory systems and in high-performance storage with semi-random access capabilities. Finally, delve into the selection of appropriate interfaces and semantics for these emerging technologies, which promise significant improvements in energy efficiency, bandwidth, and capacity for AI systems. Presented by Thomas Won Ha Choi, Director and Memory Systems Architect at SK hynix.
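To make the KV cache pressure described above concrete, the sketch below estimates how cache size grows linearly with context length for a transformer decoder. The model dimensions (32 layers, 32 KV heads, head dimension 128, FP16) are illustrative assumptions for a roughly 7B-parameter-class LLM, not figures from the talk; real deployments vary with architecture, grouped-query attention, and quantization.

```python
# Illustrative KV cache size estimate for a transformer decoder.
# All model dimensions below are assumptions, not from the talk.

def kv_cache_bytes(num_layers: int, num_kv_heads: int, head_dim: int,
                   context_len: int, bytes_per_elem: int = 2,
                   batch: int = 1) -> int:
    """Bytes of KV cache: 2 tensors (K and V) per layer, per token."""
    per_token = 2 * num_layers * num_kv_heads * head_dim * bytes_per_elem
    return per_token * context_len * batch

# Hypothetical 7B-class model: 32 layers, 32 KV heads, head_dim 128, FP16.
per_token = kv_cache_bytes(32, 32, 128, context_len=1)
total_4k = kv_cache_bytes(32, 32, 128, context_len=4096)
print(f"{per_token / 1024:.0f} KiB per token")      # 512 KiB
print(f"{total_4k / 2**30:.0f} GiB at 4K context")  # 2 GiB
```

Even under these modest assumptions the cache reaches gibibytes per request at 4K context, and it is written once per generated token but read on every decoding step afterward, which is one source of the read-skewed access pattern the talk highlights.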
Syllabus
Conceptualizing Next Generation Memory & Storage Optimized for AI Inference
Taught by
Open Compute Project