
A Framework for Designing Non-Diagonal Adaptive Training Methods

Institute for Pure & Applied Mathematics (IPAM) via YouTube

Overview

Explore a 49-minute conference talk presented by Wu Lin of the Vector Institute at IPAM's Theory and Practice of Deep Learning Workshop. Delve into a framework for designing non-diagonal adaptive training methods in deep learning optimization. Discover how a probabilistic reformulation of optimization problems can exploit the Fisher-Rao geometric structure of probability families, and learn about new quasi-Newton methods for large-scale neural network training that leverage these geometric structures.

Examine the second-order perspective on adaptive methods such as RMSProp and full-matrix AdaGrad. Understand the concept of preconditioner invariance and how it can make non-diagonal adaptive methods inverse-free while preserving preconditioner structure for modern low-precision mini-batch training. Investigate Kronecker-factored adaptive methods as a bridge between non-diagonal and diagonal adaptive methods. Gain insights into the advantages of these methods for training large neural networks in half precision, eliminating numerically unstable and computationally intensive matrix decompositions and inversions.
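To make the diagonal versus non-diagonal distinction concrete, here is a minimal NumPy sketch of the two textbook update rules the talk takes as its starting point: a diagonal RMSProp-style step and a full-matrix AdaGrad-style step. This is a generic illustration under standard definitions, not code from the talk; the function names, step sizes, and the eigendecomposition-based inverse square root are our own choices, and the latter is exactly the costly, numerically sensitive operation that the inverse-free methods discussed in the talk aim to avoid.

```python
import numpy as np

def rmsprop_step(w, g, v, lr=1e-2, beta=0.9, eps=1e-8):
    """Diagonal preconditioning: scale each coordinate by an
    exponential moving average of squared gradients (RMSProp-style)."""
    v = beta * v + (1 - beta) * g**2
    w = w - lr * g / (np.sqrt(v) + eps)
    return w, v

def full_matrix_adagrad_step(w, g, G, lr=1e-2, eps=1e-8):
    """Non-diagonal preconditioning: accumulate outer products of
    gradients and precondition with the inverse matrix square root
    (full-matrix AdaGrad-style)."""
    G = G + np.outer(g, g)
    # Inverse matrix square root via eigendecomposition: this is the
    # unstable, expensive step that inverse-free methods sidestep.
    vals, vecs = np.linalg.eigh(G + eps * np.eye(w.size))
    P = vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T
    w = w - lr * P @ g
    return w, G
```

The full-matrix variant captures correlations between parameters that the diagonal one ignores, at O(d^2) memory and O(d^3) decomposition cost per step; Kronecker-factored methods sit between these two extremes.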

Syllabus

Wu Lin - A framework for designing (non-diagonal) adaptive training methods - IPAM at UCLA

Taught by

Institute for Pure & Applied Mathematics (IPAM)

