Attention Layers and Single-Location Regression - A Theoretical Analysis

Centre International de Rencontres Mathématiques via YouTube

Overview

Watch a 45-minute conference talk exploring the theoretical foundations of attention-based models in machine learning, delivered at the Centre International de Rencontres Mathématiques in Marseille, France. Delve into the single-location regression task, where the output depends on a single token within a sequence, with that token's position determined through a linear projection. Learn about a simplified predictor built on a non-linear self-attention layer that achieves asymptotic Bayes optimality and can be trained effectively despite the non-convexity of the underlying optimization problem. Understand how attention mechanisms handle sparse token information and internal linear structures, contributing to the theoretical understanding of models like the Transformer. Access this presentation through CIRM's Audiovisual Mathematics Library, which features chapter markers, keywords, abstracts, bibliographies, and Mathematics Subject Classification for enhanced navigation and comprehension.
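The single-location regression setup described above can be sketched in a small NumPy example: each sequence's label depends on only one token, whose position varies per sequence, and a single softmax-attention head with a suitable key isolates that token. This is a hypothetical toy version, not the talk's exact model — the flag coordinate marking the relevant token, the dimensions, and the `attention_predictor` function are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy single-location regression (illustrative assumption, not the talk's
# exact model): each sequence has L tokens in R^d; the label depends only
# on one token, whose position varies per sequence and is marked here by a
# flag in the last coordinate.
n, L, d = 200, 10, 6
theta = rng.normal(size=d)
theta[-1] = 0.0                      # readout ignores the position flag

X = rng.normal(size=(n, L, d))
X[:, :, -1] = 0.0                    # clear the flag coordinate everywhere
pos = rng.integers(0, L, size=n)     # latent relevant position per sequence
X[np.arange(n), pos, -1] = 1.0       # mark the relevant token
y = X[np.arange(n), pos] @ theta     # label depends on that single token only

def attention_predictor(X, key, value):
    """Simplified single-head self-attention: softmax scores softly select
    a token, then a linear readout maps the attended token to a scalar."""
    scores = X @ key                                        # (n, L) scores
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)                       # softmax over positions
    attended = np.einsum('nl,nld->nd', w, X)                # weighted token average
    return attended @ value                                 # scalar per sequence

# An oracle key attending to the flag recovers the labels almost exactly;
# learning (key, value) from data is the non-convex problem the talk analyzes.
key = 20.0 * np.eye(d)[-1]           # attend hard to the flag coordinate
pred = attention_predictor(X, key, theta)
print(np.max(np.abs(pred - y)))      # near zero: attention isolates the token
```

With the oracle key, the softmax weight on the marked token is essentially 1, so the predictor reduces to reading out the single relevant token — the behavior the talk shows a trained attention layer can recover despite the non-convex training objective.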

Syllabus

Claire Boyer: Attention layers provably solve single-location regression

Taught by

Centre International de Rencontres Mathématiques

