On the Power of Forward Pass Through Transformer Architectures

Harvard CMSA via YouTube

Overview

Explore a technical seminar presentation from Princeton University's Abhishek Panigrahi examining the computational mechanisms of transformer architectures during forward passes. Delve into two key phenomena: first, discover how moderate-sized BERT language models learn linguistic structures and parse trees during pre-training, with insights gained through synthetic PCFG data and the inside-outside algorithm. Then investigate in-context learning capabilities of large language models through the innovative Transformer in Transformer (TinT) framework, which demonstrates how a 1.3B parameter model can simulate and fine-tune a 125M parameter model in a single forward pass. Learn about the implications of these findings for understanding transformer inference processes and potential architectural improvements. Gain valuable insights into the internal workings of transformer models and their ability to execute complex computational tasks during inference.
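The inside-outside algorithm mentioned above is the classic dynamic program for computing span probabilities under a probabilistic context-free grammar (PCFG). As a rough illustration of the kind of structure the talk relates to BERT pre-training, here is a minimal sketch of the "inside" pass for a toy PCFG in Chomsky normal form; the grammar, sentence, and function names are illustrative assumptions, not material from the seminar:

```python
# Minimal sketch of the "inside" pass of the inside-outside algorithm
# for a toy PCFG in Chomsky normal form. Grammar and sentence are
# illustrative only.
from collections import defaultdict

# Binary rules A -> B C with probability p
binary = {
    "S": [("NP", "VP", 1.0)],
    "NP": [("Det", "N", 1.0)],
    "VP": [("V", "NP", 1.0)],
}
# Lexical rules (A, word) -> p
lexical = {
    ("Det", "the"): 1.0,
    ("N", "dog"): 0.5,
    ("N", "cat"): 0.5,
    ("V", "saw"): 1.0,
}

def inside_probs(words):
    n = len(words)
    # beta[i][j][A] = P(nonterminal A derives words[i..j])
    beta = [[defaultdict(float) for _ in range(n)] for _ in range(n)]
    # Base case: single-word spans from lexical rules
    for i, w in enumerate(words):
        for (A, word), p in lexical.items():
            if word == w:
                beta[i][i][A] += p
    # Recursive case: combine adjacent spans via binary rules
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            for A, rules in binary.items():
                for B, C, p in rules:
                    for k in range(i, j):
                        beta[i][j][A] += p * beta[i][k][B] * beta[k + 1][j][C]
    return beta

sent = "the dog saw the cat".split()
beta = inside_probs(sent)
print(beta[0][len(sent) - 1]["S"])  # probability that S derives the full sentence
```

The outside pass (not shown) runs an analogous dynamic program top-down; together the two yield the posterior probabilities over spans that the talk connects to the parse-tree structure BERT-style models appear to learn.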

Syllabus

Abhishek Panigrahi | On the Power of Forward pass through Transformer Architectures

Taught by

Harvard CMSA

Reviews

