
Exploiting Low-Dimensional Data Structures and Understanding Neural Scaling Laws of Transformers

Institute for Pure & Applied Mathematics (IPAM) via YouTube

Overview

Explore the theoretical foundations behind transformer scaling laws in this 41-minute conference presentation from IPAM's Scientific Machine Learning Workshop. Discover how low-dimensional structure in language data can explain why transformer-based large language models exhibit predictable power-law scaling with respect to model size and data size. Learn how the intrinsic dimension of language datasets is estimated, and examine statistical estimation and mathematical approximation theories for transformers that predict these scaling phenomena. Understand how exploiting low-dimensional data structures provides insight into transformer behavior that respects the geometry of the data, and review empirical validation with trained language models demonstrating strong agreement between observed scaling laws and theoretical predictions.
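
For orientation, scaling laws of this kind are typically written as additive power laws in model size and data size. The display below is a schematic form from the broader scaling-laws literature, with exponents assumed to shrink as the intrinsic dimension of the data grows; it is an illustrative sketch, not a verbatim statement of the talk's theorem:

```latex
% L(N, D): test loss, N: model size, D: data size,
% d: intrinsic dimension of the data (all symbols here are illustrative).
L(N, D) \;\approx\; c_N\, N^{-\alpha_N} + c_D\, D^{-\alpha_D},
\qquad \alpha_N,\ \alpha_D \;\propto\; \frac{1}{d}
```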
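Similarly, here is a minimal sketch of intrinsic dimension estimation, using the standard TwoNN estimator (Facco et al., 2017) rather than necessarily the method used in the lecture. The idea is that the ratio of each point's second- to first-nearest-neighbor distance follows a Pareto law whose shape parameter is the intrinsic dimension:

```python
import numpy as np
from scipy.spatial import cKDTree

def twonn_intrinsic_dimension(X: np.ndarray) -> float:
    """TwoNN estimate: mu = r2/r1 is Pareto(d), so the MLE is n / sum(log mu)."""
    # With k=3, each row of dists holds: distance to itself (0), r1, and r2.
    dists, _ = cKDTree(X).query(X, k=3)
    mu = dists[:, 2] / dists[:, 1]
    return len(X) / np.log(mu).sum()

# Sanity check: samples from a 2-D plane embedded in 10-D ambient space.
rng = np.random.default_rng(0)
X = rng.standard_normal((5000, 2)) @ rng.standard_normal((2, 10))
print(twonn_intrinsic_dimension(X))  # close to 2, not 10
```

Run over embeddings of a language dataset, estimates like this coming out far below the ambient embedding dimension are the sense in which such data are said to have low-dimensional structure.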

Syllabus

Wenjing Liao - Exploiting Low-Dimensional Data Structures and Understanding Neural Scaling Laws of Transformers

Taught by

Institute for Pure & Applied Mathematics (IPAM)

