Overview
Explore how fine-tuning can inadvertently compromise AI safety guardrails through geometric effects in this 18-minute video analysis. Delve into Princeton University's research on "The Geometry of Alignment Collapse," which describes a paradoxical relationship between elegant mathematical optimization and AI safety measures. Examine the newly reported Time^4 scaling phenomenon and understand how geometric transformations during fine-tuning can systematically break down safety mechanisms previously thought to be robust. Learn about the alignment paradox, in which improving a model's performance through fine-tuning can simultaneously weaken the safety measures designed to keep AI systems aligned with human values and intentions.
Syllabus
How Geometry Destroys AI Safety: NEW Time^4 Scaling (Princeton)
Taught by
Discover AI