
MIT OpenCourseWare

Architectures - Transformers - Lecture 8

MIT OpenCourseWare via YouTube

Overview

Explore transformer architectures in this MIT Deep Learning lecture, which covers the fundamental concepts underlying one of the most influential neural network architectures in modern AI. Learn about the three key components that make transformers work: tokens as discrete units of information, attention mechanisms that let models focus on the relevant parts of an input sequence, and positional encodings that give models a sense of element order. Discover how transformers relate to and build on other neural network architectures, including Multi-Layer Perceptrons (MLPs), Graph Neural Networks (GNNs), and Convolutional Neural Networks (CNNs), viewing them as variations on common computational principles. Gain insight into the theoretical foundations and practical implementations that have made transformers the backbone of breakthrough models in natural language processing, computer vision, and beyond.
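The attention mechanism mentioned above — letting each token weight every other token by relevance — can be sketched in a few lines. This is a minimal NumPy illustration of standard scaled dot-product attention, not code from the lecture; the function names and toy data are illustrative.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: subtract the row max before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Each query scores every key; scaling by sqrt(d_k) keeps scores
    # in a range where the softmax is not saturated.
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    weights = softmax(scores, axis=-1)   # rows sum to 1
    return weights @ V, weights

# Toy self-attention: 3 tokens with 4-dimensional embeddings (made-up values).
rng = np.random.default_rng(0)
tokens = rng.normal(size=(3, 4))
out, w = scaled_dot_product_attention(tokens, tokens, tokens)
```

Here the same token matrix serves as queries, keys, and values (self-attention); in a full transformer each role gets its own learned linear projection, and positional encodings are added to the token embeddings so the model can distinguish order.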

Syllabus

Lec 08. Architectures: Transformers

Taught by

MIT OpenCourseWare

