MoNaCo - Natural Questions for Deep Reasoning Across Dozens of Documents

Center for Language & Speech Processing (CLSP), JHU via YouTube

Overview

Explore a research presentation introducing MoNaCo, a benchmark designed to evaluate the question-answering capabilities of large language models on complex, multi-document reasoning tasks. Learn about the development of 1,315 challenging information-seeking questions that require synthesizing and reasoning across dozens of Wikipedia tables and passages, addressing a critical gap in current evaluation methodologies. Discover the performance results of 15 frontier LLMs, including GPT-5, o3, Claude Opus 4, Gemini 2.5 Pro, and DeepSeek-R1, with the top-performing model achieving only 38.7% perfect scores. Understand how this benchmark shows that factuality remains a significant challenge for LLMs even though many existing factual QA benchmarks are saturated, and gain insight into the limitations of current AI systems on real-world information synthesis problems similar to those tackled by tools like Deep Research.

Syllabus

MoNaCo: Natural Questions for Deep Reasoning Across Dozens of Documents - Tomer Wolfson

Taught by

Center for Language & Speech Processing (CLSP), JHU
