Overview
In this hour-long talk, Ethan Perez of Anthropic explores how monitoring mechanisms can be used to control untrusted AI systems. Perez discusses approaches to providing safety guarantees for large language models (LLMs) by using monitors that detect and prevent potentially harmful outputs or behaviors. The talk covers techniques for maintaining control over increasingly powerful AI systems, even when the underlying models cannot be fully trusted or verified as safe.
Syllabus
Controlling Untrusted AIs With Monitors
Taught by
Simons Institute