Overview
Explore a groundbreaking mathematical lecture that challenges decades of conventional wisdom about gradient-descent-ascent (GDA) algorithms in optimization. Discover how University of Pennsylvania researcher Jason Altschuler demonstrates that GDA can converge in its original form through innovative stepsize scheduling, eliminating the need for modifications like extragradients, optimism, or momentum that have dominated the field since GDA's convergence failures were identified in the 1970s. Learn how time-varying, asymmetric, and periodically negative stepsizes solve min-max problems that are central to optimization, game theory, machine learning, and control systems.

Understand the mathematical intuition behind the "slingshot phenomenon," in which backward progress de-synchronizes the min and max variables, breaking the cycling behavior that traditionally prevents GDA from converging and enabling faster overall convergence. Examine the geometric picture: a positive step and a negative step cancel to first order, leaving a second-order net movement in a new direction that exploits the non-reversibility of gradient flow. Gain insight into why all three stepsize properties (time-varying, asymmetric, and negative) are mathematically necessary for convergence, and see how this approach handles classical counterexamples, such as unconstrained convex-concave problems, on which traditional GDA fails.
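To make the cycling issue and the role of negative steps concrete, here is a minimal sketch on the textbook bilinear saddle f(x, y) = x*y, the kind of unconstrained convex-concave counterexample the overview mentions. The stepsize values below are hypothetical placeholders chosen for illustration, not the schedule analyzed in the lecture. On this problem each simultaneous GDA step scales area by a factor of 1 + eta_x*eta_y, so any schedule whose stepsizes always share a sign expands area and can never contract toward the saddle, while mixing in negative steps can.

import math

# Toy bilinear saddle-point problem: min_x max_y f(x, y) = x * y,
# whose unique saddle point is (0, 0). Gradients: df/dx = y, df/dy = x.

def gda(x, y, schedule, iters=60):
    """Simultaneous gradient-descent-ascent with a cyclic stepsize
    schedule. `schedule` is a list of (eta_x, eta_y) pairs; a negative
    entry steps against the usual descent/ascent direction."""
    for k in range(iters):
        eta_x, eta_y = schedule[k % len(schedule)]
        x, y = x - eta_x * y, y + eta_y * x  # simultaneous update
    return x, y

# Classical failure: constant symmetric stepsizes spiral outward, since
# each step here scales area by 1 + 0.2 * 0.2 > 1, and same-sign
# stepsizes can never contract toward the saddle on this problem.
x1, y1 = gda(1.0, 1.0, schedule=[(0.2, 0.2)])

# A time-varying, asymmetric, periodically negative schedule in the
# spirit of the lecture. These numbers are hypothetical placeholders
# picked so that the 3-step cycle is a contraction; they are NOT the
# schedule from the talk.
x2, y2 = gda(1.0, 1.0, schedule=[(0.9, 0.3), (-0.5, 0.7), (0.6, -0.4)])

print(f"constant symmetric steps: |(x, y)| = {math.hypot(x1, y1):.3f}")  # grows
print(f"negative steps mixed in:  |(x, y)| = {math.hypot(x2, y2):.3f}")  # shrinks

Running the script prints a growing distance to the saddle for the symmetric schedule and a shrinking one for the mixed-sign schedule, matching the slingshot intuition that occasional backward progress enables faster net convergence.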
Syllabus
Jason Altschuler | Negative Stepsizes Make Gradient-Descent-Ascent Converge
Taught by
Harvard CMSA