Exploring the High-dimensional Random Landscapes of Data Science - Part 1
Institut des Hautes Etudes Scientifiques (IHES) via YouTube
Overview
Explore the complex mathematical foundations underlying machine learning and data science optimization in this comprehensive lecture that examines high-dimensional random landscapes and their topological properties. Delve into the challenges of optimizing complex random functions in very high dimensions, focusing on how standard algorithms like Stochastic Gradient Descent (SGD) perform in these difficult contexts. Begin with an accessible introduction to the framework covering typical machine learning tasks and neural network structures before progressing to classical SGD applications in finite dimensions.

Investigate the Tensor PCA model as a key example, understanding its relationship to spherical spin glasses and examining how simple algorithms perform in single spike estimation tasks. Advance to single index models and discover the concept of "effective dynamics" and "summary statistics" that operate in reduced dimensions to govern algorithm performance.

Learn how systems identify these summary statistics through dynamical spectral transitions, where Gram matrices and Hessian matrices develop outliers along optimization trajectories. Master essential Random Matrix Theory tools including spectral edge behavior and the BBP transition, applying these concepts to practical machine learning examples such as multilayer neural networks for Gaussian mixture classification, XOR problems, and multi-spike Tensor PCA models.
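The BBP transition mentioned in the overview can be illustrated numerically. The sketch below (parameter names and values are illustrative, not taken from the lecture) adds a rank-one "spike" of strength theta to a symmetric Gaussian (Wigner) random matrix, whose bulk spectrum fills [-2, 2]. For theta below 1 the top eigenvalue sticks to the bulk edge near 2; above 1 it detaches and tracks theta + 1/theta, which is the outlier behavior that signals a recoverable summary statistic.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000                       # matrix dimension (illustrative choice)
v = np.ones(n) / np.sqrt(n)    # unit "spike" direction

def top_eig(theta):
    """Top eigenvalue of a spiked Wigner matrix with signal strength theta."""
    A = rng.normal(size=(n, n))
    W = (A + A.T) / np.sqrt(2 * n)      # Wigner matrix; bulk edge near 2
    M = W + theta * np.outer(v, v)      # rank-one spike added to the noise
    return np.linalg.eigvalsh(M)[-1]

# Below the transition (theta < 1): top eigenvalue stays near the edge 2.
# Above it (theta > 1): an outlier emerges near theta + 1/theta.
weak, strong = top_eig(0.5), top_eig(3.0)
```

At n = 1000 the finite-size fluctuations are small, so `strong` lands close to 3 + 1/3 while `weak` remains near the bulk edge, reproducing the transition qualitatively.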
Syllabus
Gérard Ben Arous - 1/3 Exploring the High-dimensional Random Landscapes of Data Science
Taught by
Institut des Hautes Etudes Scientifiques (IHES)