Deep Learning Vision Architectures Explained - CNNs from LeNet to Vision Transformers

via freeCodeCamp

Overview

This roughly 5-hour course traces the evolution of deep learning vision models, conceptually and architecturally, from foundational networks such as LeNet and AlexNet to later architectures including ResNet, EfficientNet, and Vision Transformers. It covers the design ideas that shaped computer vision: skip connections, bottlenecks, identity preservation, depth/width trade-offs, and attention mechanisms.

The course begins with LeNet, then moves through AlexNet's breakthrough innovations, VGG's emphasis on simplicity, and GoogLeNet's Inception modules. It then covers Highway Networks and their pathways for preserving information, followed by ResNet's residual connections and the Wide ResNet variant. Later sections cover DenseNet's dense connectivity, Xception's depthwise separable convolutions, MobileNets' efficiency-focused design, and EfficientNets' compound scaling. The course concludes with Vision Transformers and their attention-based approach to image processing.

Each architecture is presented with visual explanations, historical context, and side-by-side comparisons that show why the models are structured as they are and how they process visual information.
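
As a rough illustration of the skip-connection idea mentioned above, here is a minimal sketch in plain Python. It is not taken from the course: the function names (`relu`, `linear`, `residual_block`) and the two-layer form of the residual branch are assumptions chosen for brevity, and real ResNet blocks use convolutions and normalization rather than small dense layers.

```python
# Illustrative sketch only: a toy residual block on plain Python lists.
# All names here are hypothetical and not from the course material.

def relu(v):
    """Element-wise ReLU."""
    return [max(0.0, x) for x in v]

def linear(weights, v):
    """Matrix-vector product; weights is a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) for row in weights]

def residual_block(v, w1, w2):
    """Return relu(v + F(v)), where F(v) = W2 * relu(W1 * v).

    The "+ v" term is the skip connection: if F outputs zeros,
    the input passes through the block unchanged (up to the final ReLU).
    """
    f = linear(w2, relu(linear(w1, v)))
    return relu([a + b for a, b in zip(v, f)])

x = [1.0, 2.0, 3.0]
zeros = [[0.0] * 3 for _ in range(3)]
print(residual_block(x, zeros, zeros))  # [1.0, 2.0, 3.0]
```

With all-zero weights the residual branch contributes nothing, so the block returns its (non-negative) input unchanged. This is one way to see what "identity preservation" means in this context.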

Syllabus

⌨️ 0:00:00 Welcoming and Introduction
⌨️ 0:01:44 What We'll Cover Broadly
⌨️ 0:05:34 LeNet Architecture Model
⌨️ 0:22:51 AlexNet Architecture Model
⌨️ 0:46:26 VGG Architecture Model
⌨️ 1:01:41 GoogLeNet / Inception Architecture Model
⌨️ 1:36:50 Highway Networks Architecture Model
⌨️ 2:00:45 Pathways of Information Preservation
⌨️ 2:18:03 ResNet Architecture Model
⌨️ 2:54:00 Wide ResNet Architecture Model
⌨️ 3:14:11 DenseNet Architecture Model
⌨️ 3:33:47 Xception
⌨️ 3:48:04 MobileNets
⌨️ 4:07:56 EfficientNets
⌨️ 4:24:32 Vision Transformers and The Ending
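
The Xception and MobileNets sections concern depthwise separable convolutions. As a hedged back-of-the-envelope sketch (not from the course), the snippet below compares weight counts for a standard convolution and a depthwise separable one. The function names and the example sizes (3x3 kernel, 64 input channels, 128 output channels) are assumptions, and biases are ignored.

```python
# Illustrative sketch only: weight counts, ignoring biases.
# Function names and example sizes are hypothetical.

def standard_conv_params(k, c_in, c_out):
    """A standard k x k convolution mixes space and channels at once."""
    return k * k * c_in * c_out

def separable_conv_params(k, c_in, c_out):
    """Depthwise separable: one k x k filter per input channel,
    followed by a 1 x 1 (pointwise) convolution that mixes channels."""
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise

k, c_in, c_out = 3, 64, 128
print(standard_conv_params(k, c_in, c_out))   # 73728
print(separable_conv_params(k, c_in, c_out))  # 8768
```

For these example sizes the separable version uses roughly 8 times fewer weights, which shows why this factorization suits efficiency-focused designs.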

Taught by

freeCodeCamp.org
