Explore the mathematical foundations of large machine learning models in this lecture, which challenges classical theoretical wisdom about AI. Discover why modern AI models succeed even though they rely on non-convex optimization, are complex enough that they often memorize their training data, and use architectures that rarely offer parsimonious representations of the target distribution. Survey recent theoretical work addressing these conceptual challenges and understand how learning can occur in scenarios where classical theory does not expect it. Examine the disconnect between traditional machine learning theory, which recommended convex optimization and controlled model complexity, and the reality of successful modern AI systems. Gain insight into the mathematical principles that govern large-scale machine learning models and the ongoing research efforts to bridge the gap between classical theory and contemporary AI practice.