Emergence of In-Context Learning in Small Transformer Models
International Centre for Theoretical Sciences via YouTube
Overview
Explore the emergence of in-context learning in small transformer models in this conference talk delivered at the International Centre for Theoretical Sciences. Discover how transformer architectures develop the ability to learn new tasks from examples in their context window, even at scales far smaller than typical large language models. Examine the theoretical foundations and empirical evidence for in-context learning, including the mechanisms that let these models perform few-shot learning without any parameter updates. Investigate the mathematical principles underlying this emergent behavior and how small transformers can exhibit learning capabilities previously thought to require much larger architectures. Learn about the implications of these findings for our understanding of neural network learning dynamics, as well as potential applications in resource-constrained settings where smaller models are preferred.
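As a concrete illustration of the setup described above, the sketch below trains a tiny causal transformer on prompts built from (x, y) example pairs drawn from a fresh random linear task on every prompt, so the only way to predict the query label is to infer the task from the context rather than from the weights. This follows a common in-context regression testbed (cf. Garg et al., 2022); the dimensions, architecture, and training schedule here are illustrative assumptions, not details taken from the talk.

```python
# Minimal sketch of an in-context learning setup for a small transformer.
# Each prompt interleaves k example pairs (x_i, y_i) from a random linear
# task with a final query x; the model must predict the query's label with
# no weight updates at prompt time. All sizes below are illustrative.
import torch
import torch.nn as nn

D, K, BATCH = 8, 16, 32  # input dim, in-context examples per prompt, batch


def sample_prompts(batch, d=D, k=K):
    """Sample a fresh linear task w per prompt and build token sequences."""
    w = torch.randn(batch, d, 1)        # new task vector for every prompt
    xs = torch.randn(batch, k + 1, d)   # k demonstrations + 1 query point
    ys = (xs @ w).squeeze(-1)           # noiseless labels y = w . x
    y_tok = torch.zeros(batch, k + 1, d)
    y_tok[:, :, 0] = ys                 # embed scalar label in coordinate 0
    # Interleave tokens: x_1, y_1, ..., x_k, y_k, x_query (final y dropped)
    seq = torch.stack([xs, y_tok], dim=2).reshape(batch, 2 * (k + 1), d)
    return seq[:, :-1, :], ys[:, -1]    # sequence ends at query; y is target


class TinyICLTransformer(nn.Module):
    """A small causal transformer read out at the query position."""

    def __init__(self, d=D, width=64, heads=4, layers=2):
        super().__init__()
        self.embed = nn.Linear(d, width)
        block = nn.TransformerEncoderLayer(width, heads, 4 * width,
                                           batch_first=True)
        self.blocks = nn.TransformerEncoder(block, layers)
        self.readout = nn.Linear(width, 1)

    def forward(self, seq):
        n = seq.size(1)
        # Causal mask: each position attends only to earlier tokens
        mask = torch.triu(torch.full((n, n), float("-inf")), diagonal=1)
        h = self.blocks(self.embed(seq), mask=mask)
        return self.readout(h[:, -1, :]).squeeze(-1)  # predict at the query


model = TinyICLTransformer()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(200):                 # short run, for illustration only
    seq, target = sample_prompts(BATCH)
    loss = ((model(seq) - target) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    if step % 50 == 0:
        print(f"step {step:3d}  loss {loss.item():.3f}")
```

Because every prompt carries a different task vector, a falling loss here can only come from the model using the in-context examples at inference time, which is the emergent capability the talk examines.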
Syllabus
Emergence of In-context Learning in Small Transformer Models by Gautam Reddy
Taught by
International Centre for Theoretical Sciences