Efficient AI Serving at Scale - Processing Near Memory Acceleration for LLMs and Vector Search

Open Compute Project via YouTube

Details

Provider: YouTube
Pricing: Free Video
Languages: English
Effort: 18 minutes
Sessions: Self-Paced
Level: Advanced

This 18-minute conference talk from the Open Compute Project covers processing-near-memory acceleration for large language models (LLMs) and vector search. Senior engineering leaders from Marvell present hardware acceleration approaches for efficient AI serving at scale, including memory-centric computing architectures aimed at modern AI workloads. The talk discusses the technical challenges of deploying LLMs and vector search systems at enterprise scale, and how near-memory processing can improve throughput and reduce latency for AI inference.

Syllabus

Efficient AI Serving at Scale - Processing Near Memory Acceleration for LLMs and Vector Search

Taught by

Open Compute Project
