Understanding Retrieval Heads in Large Language Models - From Discovery to Applications
Discover AI via YouTube
Overview
Syllabus
Intro: Green grasshoppers
What do attention heads focus on?
Long-context factuality by retrieval heads
Needle in a Haystack Benchmark
How many retrieval heads are in an LLM?
What is a retrieval head?
Retrieval heatmap consistent with pre-trained base model
Retrieval heads and Chain-of-Thought Reasoning
Retrieval heads explain why LLMs hallucinate
How to generate more retrieval heads in LLMs?
Taught by
Discover AI