Co-Designing for Scale - CXL Based Memory Solution for Data Centric Workloads
Open Compute Project via YouTube
Overview
Explore hardware-software co-optimized solutions for memory-bound workloads in this 30-minute conference talk from the Open Compute Project. Learn how disaggregated, composable CXL resources can augment memory-constrained AI accelerators through flexible system architectures designed for data-intensive applications, including billion-scale vector search, cloud-native and in-memory databases, data analytics, and large-scale AI inferencing. Discover approaches built on switch-based disaggregated memory, Near Memory Compute accelerators, and memory pooling appliances that aim to improve efficiency, reduce total cost of ownership, and raise system performance. Examine how open-source frameworks such as NVIDIA Dynamo, along with KV-cache-focused solutions like Mooncake and LMCache, address the software challenges introduced by heterogeneity and lower adoption barriers for cost-effective, scalable, and energy-efficient AI infrastructure.
The speakers are Gaurav Agarwal from Marvell, Anil Godbole from Intel, Jianping Jiang from Xconn Technologies Holdings, and Xinjun Yang from Alibaba.
Syllabus
Co-Designing for Scale - CXL Based Memory Solution for Data Centric Workloads
Taught by
Open Compute Project