Overview
Explore a 59-minute lecture by Siva Reddy (IVADO, Mila, McGill University), presented at the Simons Institute, examining how vulnerable aligned language models are to jailbreaking attempts. Investigate how these exploits transfer across different types of AI systems, including standard large language models, reasoning-enhanced models, and autonomous agents. The presentation, part of the Safety-Guaranteed LLMs series, offers insights into the robustness challenges facing AI safety mechanisms and their implications for developing more secure AI systems.
Syllabus
Robustness of jailbreaking across aligned LLMs, reasoning models and agents
Taught by
Simons Institute