Probably Approximately Correct Labels

Harvard CMSA via YouTube

Overview

Explore a mathematical framework for cost-effective dataset labeling that combines expert annotations with AI predictions, presented in this conference talk from Harvard's Center of Mathematical Sciences and Applications. Learn how to construct high-quality labeled datasets by supplementing expensive human annotations or experimental measurements with predictions from pre-trained AI models while maintaining rigorous statistical guarantees. Discover the theoretical foundations behind "probably approximately correct labels," a method that ensures, with high probability, that the overall labeling error stays small. Examine practical applications in three domains: text annotation with large language models, image classification with pre-trained vision models, and protein structure analysis with AlphaFold. Understand how this approach enables efficient dataset curation while preserving the reliability needed for downstream machine learning. The talk was presented as part of the Workshop on Mathematical Foundations of AI by Stanford researcher Tijana Zrnic, based on joint work with Emmanuel Candès and Andrew Ilyas.
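The core idea described above is to auto-label with a pre-trained model where it is reliably confident and reserve expensive expert labels for the rest, while controlling the labeling error with high probability. The sketch below illustrates one simple way such a confidence threshold might be calibrated, using a Hoeffding upper confidence bound on a small expert-labeled calibration set. This is a hypothetical toy illustration on simulated data, not the construction from the talk; all parameters (`eps`, `delta`, the simulated confidences) are assumptions for the example.

```python
import numpy as np

# Toy setting: a model gives each unlabeled item a confidence score, and
# higher-confidence predictions are more often correct. We simulate that here.
rng = np.random.default_rng(0)
n = 2000
conf = rng.uniform(0.5, 1.0, n)          # model confidence per item
correct = rng.random(n) < conf           # whether the model label is right

# Spend expert labels on a calibration set; the rest is the unlabeled pool.
cal, pool = slice(0, 1000), slice(1000, n)

def pick_threshold(conf, correct, eps, delta):
    """Choose a confidence threshold t so that, on the calibration set, a
    Hoeffding upper confidence bound on the error rate of model labels
    accepted at confidence >= t stays below eps (with prob. >= 1 - delta).
    A heuristic sketch, not the guarantee construction from the talk."""
    order = np.argsort(-conf)            # most confident first
    conf, correct = conf[order], correct[order]
    t = None
    for k in range(1, len(conf) + 1):
        err_hat = 1.0 - correct[:k].mean()
        ucb = err_hat + np.sqrt(np.log(1.0 / delta) / (2.0 * k))
        if ucb <= eps:
            t = conf[k - 1]              # accept down to the k-th confidence
    return t

t = pick_threshold(conf[cal], correct[cal], eps=0.2, delta=0.05)
auto = conf[pool] >= t                   # model-labeled items
print(f"threshold={t:.3f}, auto-labeled {auto.mean():.0%}, "
      f"sent to experts {1 - auto.mean():.0%}")
```

Items in the pool above the calibrated threshold keep the model's label; the rest are routed to human experts, so the expert budget is spent only where the model is unreliable.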

Syllabus

Tijana Zrnic | Probably Approximately Correct Labels

Taught by

Harvard CMSA
