
Detecting Confident Nonsense - Testing LLM-Driven Apps

Code Sync via YouTube

Overview

Learn practical strategies for testing and evaluating LLM-driven applications in this 20-minute conference talk from Code BEAM Europe 2025. Explore the challenges developers face when integrating large language models into products, particularly the problem of "confident nonsense" where AI systems provide fluent but incorrect or potentially harmful responses. Discover evaluation techniques ranging from basic BLEU and ROUGE metrics to more sophisticated aspect-based evaluation and retrieval scoring methods. Understand what metrics to measure, when to trust different evaluation approaches, and how to implement testing strategies that can catch problematic AI responses before they reach production users. Gain insights into building robust validation systems for applications that generate human language, moving beyond traditional unit testing to address the unique challenges of LLM integration.
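The talk itself is not a coding tutorial, but the surface-overlap metrics it covers (BLEU, ROUGE) are simple enough to sketch. Below is a minimal ROUGE-1-style unigram recall in plain Python, an illustrative sketch rather than anything from the talk; it also shows why the talk argues such metrics alone cannot catch "confident nonsense", since a fluent but wrong answer can still score highly on word overlap:

```python
from collections import Counter

def rouge1_recall(reference: str, candidate: str) -> float:
    """Unigram recall: fraction of reference words also found in the candidate.

    Illustrative sketch of a ROUGE-1-style metric; real evaluations
    use tuned implementations with stemming and multiple references.
    """
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    overlap = sum(min(n, cand_counts[w]) for w, n in ref_counts.items())
    total = sum(ref_counts.values())
    return overlap / total if total else 0.0

reference = "the capital of france is paris"
correct = "paris is the capital of france"
nonsense = "the capital of france is lyon"

# The confidently wrong answer still overlaps on 5 of 6 reference words.
print(rouge1_recall(reference, correct))   # 1.0
print(rouge1_recall(reference, nonsense))  # ~0.83
```

This gap between fluency-adjacent overlap scores and factual correctness is what motivates the aspect-based evaluation and retrieval scoring methods described above.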

Syllabus

Detecting Confident Nonsense: Testing LLM-Driven Apps - Hernan Rivas Acosta | Code BEAM Europe 2025

Taught by

Code Sync

