
Interpretability of LLMs - Superposition

UofU Data Science via YouTube

Overview

Explore the concept of superposition in large language models through this university lecture that examines how neural networks represent and process multiple features simultaneously within individual neurons. Delve into the theoretical foundations of superposition as a key mechanism for understanding how LLMs compress and encode information, investigating how models can represent more features than they have dimensions. Learn about the mathematical principles underlying superposition, its implications for model interpretability, and how this phenomenon affects our ability to understand what language models have learned. Examine research methodologies for detecting and analyzing superposition in neural networks, including techniques for disentangling overlapping representations and measuring feature interference. Discover the challenges superposition presents for mechanistic interpretability and explore current approaches to addressing these obstacles in the quest to make large language models more transparent and explainable.
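
To make the central idea concrete, here is a minimal sketch (not from the lecture; it assumes NumPy and a toy setup where features are random unit directions) showing that a d-dimensional space can hold many more than d nearly-orthogonal feature directions, at the cost of small pairwise interference between them:

# Toy superposition demo: embed n sparse "features" as nearly-orthogonal
# directions in a d-dimensional space with n > d, then measure the
# interference (dot products) between the feature directions.
import numpy as np

rng = np.random.default_rng(0)
n_features, d_model = 64, 16          # more features than dimensions

# Random unit vectors in d dimensions serve as feature directions.
W = rng.normal(size=(n_features, d_model))
W /= np.linalg.norm(W, axis=1, keepdims=True)

# Interference: off-diagonal entries of the Gram matrix W W^T.
gram = W @ W.T
interference = gram - np.eye(n_features)
print(f"mean |interference|: {np.abs(interference).mean():.3f}")
print(f"max  |interference|: {np.abs(interference).max():.3f}")

# Reading out feature i from activation x is the dot product w_i . x.
# When only one feature fires, its own readout is 1 and every other
# feature sees only a small cross-term.
x = W[3]                               # activation when only feature 3 fires
readout = W @ x                        # each feature detector's response
print("feature 3 readout:", readout[3].round(3),
      "| max other:", np.abs(np.delete(readout, 3)).max().round(3))

Typical off-diagonal overlaps here are on the order of 1/sqrt(d), so as long as features fire sparsely the cross-terms rarely pile up; that trade-off between capacity and interference is exactly what the lecture's notion of superposition describes.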

Syllabus

Announcements
Lecture Starts

Taught by

UofU Data Science

