Optimizing AI Inferencing with CXL Memory - Memory Tiering Strategies for Enhanced Performance
Open Compute Project via YouTube
Overview
Learn how CXL-attached memory can improve AI inference performance for Large Language Models (LLMs) in this 20-minute technical presentation from Astera Labs experts. Explore memory tiering strategies for AI inference platforms, focusing on how Compute Express Link (CXL) technology can improve performance, scalability, and cost-effectiveness for memory-intensive applications. Discover techniques for raising CPU and GPU utilization, reducing latency, and increasing throughput when working with large datasets. Gain insight into the emerging role of CXL memory architecture and its potential impact on Generative AI.
Syllabus
Optimizing AI Inferencing with CXL Memory
Taught by
Open Compute Project