Overview
In this talk, Jacob Hilton of the Alignment Research Center explores how to establish probabilistic safety guarantees for large language models by examining their internal mechanisms. The 46-minute presentation, delivered at the Simons Institute's Safety-Guaranteed LLMs event, covers technical approaches to analyzing model internals in order to provide more reliable safety assurances for advanced AI systems.
Syllabus
Probabilistic Safety Guarantees Using Model Internals
Taught by
Simons Institute