Conceptualizing Next Generation Memory and Storage Optimized for AI Inference
Open Compute Project via YouTube
Overview
Explore memory and storage systems designed specifically for AI inference in this conference talk. Discover how the rapid growth of large language models (LLMs) and expanding context lengths are driving massive data access demands, particularly for model weights and Key-Value (KV) cache data. Learn about the shift in how memory and storage are understood: traditional architectures face new challenges, and compute increasingly needs to be offloaded into or near memory to improve energy efficiency and reduce power consumption. Examine the access patterns of AI inference workloads, which are more read-skewed than those of general-purpose applications and show semi-sequential storage access behavior that differs from conventional expectations. Gain insights into next-generation memory concepts, including Processing-In-Memory (PIM) technology, which improves energy efficiency and bandwidth by reducing excessive data movement. Understand pathfinding developments in read-skewed high-capacity memory and in high-performance storage with semi-random access capabilities. Delve into the selection of appropriate interfaces and semantics for these emerging technologies, which promise significant improvements in energy efficiency, bandwidth, and capacity for AI systems. The talk is presented by Thomas Won Ha Choi, Director and Memory Systems Architect at SK hynix.
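To make the link between context length and KV cache demand concrete, here is a rough back-of-the-envelope sketch. It uses the standard sizing rule for a transformer KV cache (one key and one value vector per layer, per KV head, per token). The model shape below (32 layers, 8 KV heads, head dimension 128, 16-bit values) is a hypothetical example chosen for illustration and is not taken from the talk.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len,
                   batch_size=1, bytes_per_elem=2):
    """Estimate KV cache size in bytes for a decoder-only transformer.

    The leading factor of 2 accounts for storing both keys and values.
    All parameters are illustrative assumptions, not figures from the talk.
    """
    return (2 * num_layers * num_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_elem)


# Hypothetical model shape: 32 layers, 8 KV heads, head_dim 128, fp16 values.
per_token = kv_cache_bytes(32, 8, 128, seq_len=1)
long_context = kv_cache_bytes(32, 8, 128, seq_len=128 * 1024)

print(f"per token:   {per_token / 1024:.0f} KiB")        # 128 KiB
print(f"128K tokens: {long_context / 1024**3:.0f} GiB")  # 16 GiB
```

Under these assumptions the cache grows linearly with context length and batch size, and it is reread on every decoding step while being appended to only one token at a time. That is one way to see why inference traffic ends up read-skewed.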
Syllabus
Conceptualizing Next Generation Memory & Storage Optimized for AI Inference
Taught by
Open Compute Project