Chameleon - Heterogeneous and Disaggregated Accelerator System for Retrieval-Augmented Language Models

Scalable Parallel Computing Lab, SPCL @ ETH Zurich via YouTube

Overview

This research presentation covers Chameleon, a heterogeneous accelerator system for serving Retrieval-Augmented Language Models (RALMs). A RALM pairs a large language model with a vector database and retrieves context-specific knowledge during text generation, which lets smaller models reach strong generation quality and reduces computational demands by orders of magnitude.

Chameleon uses a disaggregated architecture: LLM accelerators and vector search accelerators are separate components, so each can be scaled independently to match the requirements of different RALM configurations. The prototype uses FPGAs for vector search, GPUs for LLM inference, and CPUs as cluster coordinators. In the reported evaluation, it achieves up to 2.16× lower latency and 3.18× higher throughput than a hybrid CPU-GPU architecture.

The talk is presented by Wenqi Jiang of the Scalable Parallel Computing Lab at ETH Zurich and is based on research published in the Proceedings of the VLDB Endowment. The results point to the potential of accelerating both LLM inference and vector search with heterogeneous hardware in future RALM systems.
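To make the retrieval-during-generation idea concrete, here is a minimal Python sketch of a toy decode loop that queries a vector store at a fixed interval. It is an illustration only, not Chameleon's implementation: the function names, the `retrieval_interval` parameter, and the brute-force search that stands in for the accelerated vector search are all assumptions made for this example.

```python
from typing import Callable, List, Sequence, Tuple


def top_k_neighbors(
    query: Sequence[float], database: Sequence[Sequence[float]], k: int
) -> List[int]:
    """Return indices of the k database vectors closest to `query`.

    Brute-force squared-L2 search; a hypothetical stand-in for the
    vector search accelerator described in the overview.
    """
    scored = [
        (sum((q - x) ** 2 for q, x in zip(query, vec)), idx)
        for idx, vec in enumerate(database)
    ]
    scored.sort()  # ascending distance, ties broken by index
    return [idx for _, idx in scored[:k]]


def generate_with_retrieval(
    num_tokens: int,
    retrieval_interval: int,
    embed_step: Callable[[int], Sequence[float]],
    database: Sequence[Sequence[float]],
    k: int,
) -> List[Tuple[int, List[int]]]:
    """Toy decode loop that retrieves every `retrieval_interval` steps.

    `embed_step` is a hypothetical callback that produces a query vector
    for a given step. Returns (step, neighbor indices) for each retrieval.
    """
    retrieval_log = []
    for step in range(num_tokens):
        if step % retrieval_interval == 0:
            query = embed_step(step)
            retrieval_log.append((step, top_k_neighbors(query, database, k)))
        # A real system would run one LLM decoding step here,
        # conditioned on the most recently retrieved context.
    return retrieval_log


if __name__ == "__main__":
    db = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
    log = generate_with_retrieval(4, 2, lambda s: [float(s), float(s)], db, k=1)
    print(log)
```

Because the search and the decoding step are separate functions, each could in principle be served by a different pool of workers, which is the intuition behind scaling the two kinds of accelerators independently.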

Syllabus

Chameleon: Heterogeneous & Disaggregated Accelerator System for Retrieval-Augmented Language Models

Taught by

Scalable Parallel Computing Lab, SPCL @ ETH Zurich
