Overview
Explore transformer architectures in this MIT Deep Learning lecture, which delves into the fundamental concepts underlying one of the most influential neural network architectures in modern AI. Learn about the three key components that make transformers work: tokens as discrete units of information, attention mechanisms that allow models to focus on relevant parts of input sequences, and positional encodings that help models understand the order of elements. Discover how transformers relate to and build on other neural network architectures, including Multi-Layer Perceptrons (MLPs), Graph Neural Networks (GNNs), and Convolutional Neural Networks (CNNs), understanding them as variations on common computational principles. Gain insight into the theoretical foundations and practical implementations that have made transformers the backbone of breakthrough models in natural language processing, computer vision, and beyond.
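The three components named above (tokens, attention, positional encodings) can be sketched in a few lines of NumPy. This is a minimal illustration, not the lecture's own code: the function names and dimensions are chosen for clarity, and the positional encoding follows the standard sinusoidal scheme.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def positional_encoding(seq_len, d_model):
    # Standard sinusoidal positional encodings: each position gets a
    # unique pattern of sines and cosines at different frequencies.
    pos = np.arange(seq_len)[:, None]
    i = np.arange(d_model)[None, :]
    angles = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
    enc = np.zeros((seq_len, d_model))
    enc[:, 0::2] = np.sin(angles[:, 0::2])
    enc[:, 1::2] = np.cos(angles[:, 1::2])
    return enc

def self_attention(X, Wq, Wk, Wv):
    # Scaled dot-product self-attention: every token attends to every
    # other token, weighting value vectors by query-key similarity.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # pairwise relevance scores
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ V                        # position-wise mix of values

rng = np.random.default_rng(0)
seq_len, d_model = 4, 8                       # 4 tokens, 8-dim embeddings
X = rng.normal(size=(seq_len, d_model))       # token embeddings
X = X + positional_encoding(seq_len, d_model) # inject order information
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (4, 8): one updated vector per token
```

Without the positional encoding, permuting the input tokens would merely permute the outputs, since attention itself is order-invariant; adding the sinusoidal codes is what lets the model distinguish sequence order.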
Syllabus
Lec 08. Architectures: Transformers
Taught by
MIT OpenCourseWare