Statistical Learning Theory and Neural Networks II

Continue with the second part of a tutorial on statistical learning theory and neural networks. Explore approaches to understanding neural network training from an optimization perspective, beginning with a review of the classical analysis of gradient descent on convex and smooth objectives. Examine the Polyak-Łojasiewicz (PL) inequality and its interpretation in the context of neural network training. Investigate the neural tangent kernel (NTK) regime, a setting in which neural network training is well approximated by kernel methods. Learn how to establish a PL inequality for neural networks using two approaches: a general method based on the NTK approximation and a specific technique for linearly separable data. This tutorial, presented by Spencer Frei from UC Berkeley as part of the Deep Learning Theory Workshop and Summer School at the Simons Institute, builds on the foundations laid in the first session and focuses on the optimization aspects of neural network training.
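
As a rough illustration of why a PL inequality matters for optimization, the sketch below runs gradient descent on a toy objective that is not taken from the tutorial. In its standard form, the PL inequality says that 0.5 * ||grad f(w)||^2 >= mu * (f(w) - f*) for some mu > 0. Combined with L-smoothness and a step size eta <= 1/L, it gives the per-step bound f(w_next) - f* <= (1 - eta * mu) * (f(w) - f*), with no strong convexity required. The objective, the constants MU, L and ETA, and all function names are assumptions chosen so that the inequality can be checked by hand.

```python
# Toy objective (an illustrative assumption, not from the tutorial):
#   f(w) = 0.5 * (w1 + w2 - 1)^2
# It is convex but not strongly convex: every point on the line
# w1 + w2 = 1 is a minimizer, and the optimal value is f* = 0.

def f(w):
    r = w[0] + w[1] - 1.0
    return 0.5 * r * r

def grad(w):
    r = w[0] + w[1] - 1.0
    return (r, r)

MU = 2.0    # PL constant: 0.5 * ||grad f||^2 = r^2 = 2 * f(w)
L = 2.0     # smoothness: largest eigenvalue of the Hessian [[1, 1], [1, 1]]
ETA = 0.25  # step size, chosen to satisfy ETA <= 1 / L

def pl_gap(w):
    """Return 0.5 * ||grad f(w)||^2 - MU * (f(w) - f*); PL holds when this is >= 0."""
    g = grad(w)
    return 0.5 * (g[0] ** 2 + g[1] ** 2) - MU * f(w)

def gradient_descent(w0, steps):
    """Run plain gradient descent and record the loss at every iterate."""
    w = tuple(w0)
    losses = [f(w)]
    for _ in range(steps):
        g = grad(w)
        w = (w[0] - ETA * g[0], w[1] - ETA * g[1])
        losses.append(f(w))
    return w, losses

w_final, losses = gradient_descent((3.0, -1.0), 20)
rate = 1.0 - ETA * MU  # contraction factor guaranteed by PL plus smoothness

for t in (0, 1, 2, 20):
    print(f"step {t:2d}: loss = {losses[t]:.3e}")
```

On this toy problem the loss contracts faster than the guaranteed factor `rate`, which is expected because the PL bound is a worst-case guarantee. The tutorial's NTK and linearly separable arguments concern establishing such an inequality for actual neural networks, which this sketch does not attempt.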