History of Computer Vision and CNN Architectures: From LeNet to Vision Transformers
Neural Breakdown with AVB via YouTube
Overview
Explore a comprehensive 22-minute video journey through the evolution of Convolutional Neural Networks (CNNs) for image classification, from their early research foundations to modern developments. Learn about pivotal architectures, from the 1989 breakthrough of training CNNs with backpropagation, through the landmark LeNet-5 in 1998, to the transformative AlexNet in 2012. Discover key innovations including GoogLeNet's Inception module, VGG networks, Batch Normalization, ResNet's residual connections, DenseNet's dense connectivity patterns, and MobileNet's efficiency improvements. Examine the shift toward attention mechanisms with Vision Transformers and the latest developments with ConvNeXt, complete with detailed visualizations and architectural explanations. Access supplementary materials, including animations, PowerPoint slides, and a comprehensive Medium article, to deepen your understanding of these fundamental computer vision concepts.
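The residual connection mentioned above is ResNet's central idea: each block learns a residual F(x) and adds the input back via a skip connection, so the output is F(x) + x. A minimal NumPy sketch of this idea (an illustration only, not code from the video; the two-layer block and weight shapes are assumptions):

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)

def residual_block(x, w1, w2):
    """A toy two-layer residual block: output = relu(F(x) + x)."""
    out = relu(x @ w1)    # first transformation with nonlinearity
    out = out @ w2        # second transformation, no activation yet
    return relu(out + x)  # skip connection adds the input back

rng = np.random.default_rng(0)
x = rng.standard_normal((1, 8))
w1 = rng.standard_normal((8, 8)) * 0.1
w2 = rng.standard_normal((8, 8)) * 0.1
y = residual_block(x, w1, w2)
print(y.shape)  # (1, 8): the skip connection requires matching shapes
```

Note that if the weights are zero, the block reduces to relu(x), i.e. near-identity; this is what makes very deep stacks of such blocks easy to optimize.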
Syllabus
- Intro
- Visualizing CNNs
- 1989
- 1998 - LeNet-5
- The 2000s
- 2012 - AlexNet
- 2014 - GoogLeNet and Inception Module
- 2014 - VGG
- 2015 - Batch Normalization
- 2015 - Residual Network
- 2016 - DenseNet
- 2017 - MobileNet
- 2018 - MobileNet V2
- 2020 - Vision Transformer
- 2022 - ConvNeXt
- Outro
Taught by
Neural Breakdown with AVB