Explore how concepts are linearly represented within modern AI models in this 37-minute conference talk by Mikhail Belkin of UCSD, presented at the Simons Institute's "Smale@95: A Conference in Honor of Steve Smale." Trained Large Language Models contain vast amounts of human knowledge, and many concepts can be recovered from their internal activations using linear "probes," which are mathematically equivalent to single index models. Examine how these probes are constructed from Recursive Feature Machines, a feature-learning kernel method originally developed to extract relevant features from tabular data, and their role in interpreting the internal representations of AI systems.