Accelerating AI Data Processing at Scale: Driving for Efficiency and Sustainability

This 19-minute conference talk features Gaurav Agarwal (Marvell - Sr. Director Engineering), Nilesh Shah (ZeroPoint Technologies - VP Business Development), and Parvez Shaik (Rambus - Senior Director of Product Management and Engineering) discussing innovative approaches to AI data processing challenges. Explore how data-centric AI applications, with models potentially reaching trillions of parameters and datasets reaching petabyte scale, require new computing paradigms. Learn about the limitations of contemporary memory-constrained AI accelerators when dealing with the sparsity, low-arithmetic-intensity operations, and irregular memory access patterns of modern AI workloads. Discover a new heterogeneous architecture approach that uses CXL-based near-memory compute accelerators to reduce data movement, lower energy consumption, enhance performance, and improve operational efficiency. The presentation covers flexible system designs optimized for data-intensive tasks, including vector search, graph analytics, and large language model serving at scale. Based on the Polymorphic Architecture from the OCP AI HW-SW Co-Design Workgroup, this forward-looking approach emphasizes the importance of co-design and cross-industry collaboration in overcoming adoption barriers for heterogeneous compute architectures.