Evaluating Sparse Autoencoders with Board Game Models

USC Information Sciences Institute via YouTube

Overview

This seminar by Adam Karvonen of the ML Alignment & Theory Scholars program covers research on machine learning interpretability, focusing on the challenge of evaluating Sparse Autoencoders (SAEs). Using board game models such as OthelloGPT and ChessGPT as test cases, the talk introduces new supervised metrics for assessing feature quality and board state capture, including "coverage" and "board reconstruction." It also examines the "p-annealing" training approach and its improved performance over existing methods. Even with F1 scores of 0.85 on Chess and 0.95 on Othello, SAEs still fall short of capturing complete board state information. The speaker is a machine learning researcher and competitive dirt bike racer.

Syllabus

Evaluating Sparse Autoencoders with Board Game Models

Taught by

USC Information Sciences Institute
