Layering Every Technique in RAG - One Query at a Time

AI Engineer via YouTube

Overview

Learn to build sophisticated Retrieval-Augmented Generation (RAG) systems one technique at a time in this 20-minute conference talk. Start with basic in-memory embeddings and relevance ranking, then layer on more advanced techniques until you reach planet-scale search handling 160,000 queries per second. Follow the evolution from simple BM25 term-based retrieval to complex systems that combine a mix of 70+ corpora of tokens, embeddings, and knowledge graphs with joint retrieval, custom ranking, and LLM processing. Discover why queries like "falafel" pose unique search challenges, when document chunking can be counterproductive, and how to tell the cases where simple BM25 suffices from the problems better delegated to an LLM or to user experience design. Develop the quality engineering mindset essential for building robust search systems, then work through each layer in turn: vector search with relevance embeddings, cross-encoder re-rankers, custom embeddings, domain-specific ranking signals, user preference signals, query orchestration with fan-out, supplementary retrieval, distillation, and graceful degradation. The talk draws on experience developing Google Search's core AI and NLU systems and offers practical strategies for handling increasingly complex RAG queries, along with the capabilities each layered technique unlocks and the limitations it leaves in place.
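As a rough illustration of the first layers mentioned above (in-memory retrieval with BM25 term-based ranking), here is a minimal Python sketch. It is not code from the talk: the class name `BM25Index`, the whitespace tokenizer, the sample documents, and the parameter defaults `k1=1.5` and `b=0.75` are all assumptions chosen for illustration, and the IDF uses one common smoothed variant of the formula.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive lowercase whitespace tokenizer (an assumption; real systems do more).
    return text.lower().split()


class BM25Index:
    """Hypothetical in-memory BM25 index over a non-empty list of strings."""

    def __init__(self, docs, k1=1.5, b=0.75):
        self.docs = docs
        self.k1 = k1  # term-frequency saturation (assumed default)
        self.b = b    # strength of length normalization (assumed default)
        self.term_counts = [Counter(tokenize(d)) for d in docs]
        self.doc_lens = [sum(c.values()) for c in self.term_counts]
        self.avg_len = sum(self.doc_lens) / len(docs)
        # Number of documents that contain each term.
        self.doc_freq = Counter()
        for counts in self.term_counts:
            self.doc_freq.update(counts.keys())

    def idf(self, term):
        # Smoothed inverse document frequency; always positive.
        n = self.doc_freq.get(term, 0)
        return math.log(1 + (len(self.docs) - n + 0.5) / (n + 0.5))

    def score(self, query, i):
        counts = self.term_counts[i]
        norm = self.k1 * (1 - self.b + self.b * self.doc_lens[i] / self.avg_len)
        total = 0.0
        for term in tokenize(query):
            tf = counts.get(term, 0)
            if tf:
                total += self.idf(term) * tf * (self.k1 + 1) / (tf + norm)
        return total

    def search(self, query, top_k=3):
        # Score every document, drop non-matches, return the best top_k.
        scored = [(self.score(query, i), i) for i in range(len(self.docs))]
        scored = [(s, i) for s, i in scored if s > 0]
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return [(self.docs[i], s) for s, i in scored[:top_k]]


docs = [
    "falafel wrap with tahini",
    "best pizza in town",
    "falafel falafel recipe",
]
index = BM25Index(docs)
results = index.search("falafel")
```

A single-term query such as "falafel" matches on the term alone, so a sketch like this can rank documents by term frequency and length but has no signal about what the user actually wants; the later layers in the talk (embeddings, re-rankers, ranking signals) address gaps of that kind.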

Syllabus

00:00 Introduction and Context
01:41 Quality Engineering Loop and Mindset
04:09 In-Memory Retrieval
04:50 Term-Based Retrieval BM25
05:18 Relevance Embeddings Vector Search
06:15 Re-Rankers Cross Encoders
07:59 Custom Embeddings
09:40 Domain-Specific Ranking Signals
11:09 User Preference Signals
12:17 Query Orchestration Fan Out
14:26 Supplementary Retrieval
16:09 Distillation
17:14 Punting the Problem and Graceful Degradation

Taught by

AI Engineer
