Towards Memory Efficient RAG Pipelines with CXL Technology

SNIAVideo via YouTube

Overview

This 34-minute conference presentation from SNIA SDC 2025 covers memory optimization techniques for RAG (Retrieval-Augmented Generation) pipelines in AI inference systems. Several stages of a RAG pipeline, including vector embedding creation, vector DB insertion, and search, consume substantial transient memory, and the talk shows how CXL (Compute Express Link) technology addresses that demand. Two CXL-based approaches are covered: memory pooling, which provisions memory dynamically to match transient needs, and memory tiering, which uses cheaper, larger-capacity memory to reduce the cost of locally attached memory. The presentation outlines the typical memory requirements of vector DB use cases in AI inference and explains how these approaches achieve significant DRAM Total Cost of Ownership (TCO) savings with minimal performance trade-offs. It also reviews the current state of the open-source software infrastructure needed for CXL memory pooling and tiering, and discusses ideas for bridging existing gaps in the technology stack. Presented by Arun George and Roshan Nair from Samsung Semiconductor India Research.

Syllabus

SNIA SDC 2025 - Towards Memory Efficient RAG Pipelines with CXL Technology

Taught by

SNIAVideo
