
AdamW Optimizer from Scratch in Python

Yacine Mahdid via YouTube

Overview

Learn to implement the AdamW optimizer from scratch in Python in this 20-minute tutorial, which breaks down the state-of-the-art optimization algorithm used in most modern deep learning training. Explore how AdamW differs from the traditional Adam optimizer by decoupling weight decay from the gradient-based update, a change that yields much greater training stability. Understand the fundamental difference between L2 regularization and weight decay, discover why Adam with L2 regularization was problematic, and examine the mathematical formulation behind AdamW's superior performance. Follow along with a complete code implementation while gaining insight into where this regularization technique fits within the broader landscape of optimization algorithms and why it has become the preferred choice for training deep neural networks.
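The decoupling described above can be sketched in plain Python. This is a minimal illustration, not the tutorial's own code: it keeps Adam's bias-corrected moment estimates and applies weight decay directly to the parameter, outside the adaptive gradient step (whereas L2 regularization would add `weight_decay * p` to the gradient before the moments are computed). The function name, scalar-list parameter layout, and hyperparameter defaults are assumptions for illustration.

```python
import math

def adamw_step(params, grads, state, lr=0.01, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=0.01):
    """One AdamW update over parallel lists of scalar params and grads."""
    state["t"] += 1
    t = state["t"]
    for i, (p, g) in enumerate(zip(params, grads)):
        # Exponential moving averages of the gradient and its square
        state["m"][i] = beta1 * state["m"][i] + (1 - beta1) * g
        state["v"][i] = beta2 * state["v"][i] + (1 - beta2) * g * g
        # Bias correction for the zero-initialized moments
        m_hat = state["m"][i] / (1 - beta1 ** t)
        v_hat = state["v"][i] / (1 - beta2 ** t)
        # Adam step plus *decoupled* weight decay: the decay term acts on
        # the parameter directly and never passes through m or v
        params[i] = p - lr * (m_hat / (math.sqrt(v_hat) + eps)
                              + weight_decay * p)
    return params

# Toy usage: minimize f(x) = x^2 starting from x = 5.0
params = [5.0]
state = {"t": 0, "m": [0.0], "v": [0.0]}
for _ in range(2000):
    grads = [2 * params[0]]          # gradient of x^2
    adamw_step(params, grads, state)
```

After the loop, `params[0]` sits close to the minimum at 0. The key design point is the last line of the update: moving `weight_decay * p` inside the moment estimates instead would reproduce Adam-with-L2, where the adaptive denominator rescales (and weakens) the regularization for parameters with large gradients.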

Syllabus

- Introduction: 0:00
- Where does AdamW fit?: 2:30
- What type of regularization does AdamW apply?: 3:25
- Why Adam with L2 sucked?: 4:49
- Isn't L2 and Weight Decay the same?: 5:48
- AdamW formula breakdown: 6:19
- AdamW code implementation lol: 14:32
- AdamW Recap: 19:34

Taught by

Yacine Mahdid

