Evaluation-Driven Development Workflows - Best Practices and Real-World Scenarios
Databricks via YouTube
Overview
Learn how to implement Evaluation-Driven Development (EDD) workflows in enterprise AI systems through this 42-minute conference talk from Databricks. Discover how EDD embeds continuous assessment and improvement into the AI development lifecycle to ensure reliable and efficient systems.

Explore techniques for creating high-quality evaluation datasets, including document analysis, synthetic data generation using Mosaic AI's synthetic data generation API, subject matter expert validation, and relevance filtering to reduce manual effort and accelerate workflows. Understand key evaluation metrics such as context relevance, groundedness, and response accuracy to identify and address common issues like retrieval errors and model limitations. Master the development of custom LLM judges tailored to domain-specific requirements, including PII detection and tone assessment.

Gain hands-on insights into leveraging tools like Mosaic AI Agent Framework, Agent Evaluation, and MLflow to automate data tracking, streamline workflows, and quantify improvements. Transform your AI development approach to deliver scalable, high-performing systems that drive measurable organizational value through systematic evaluation practices and real-world implementation scenarios.
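To make the "custom judge" idea concrete, the sketch below shows the shape of a domain-specific verdict function using a rule-based PII check. This is a minimal illustrative stand-in, not the Mosaic AI Agent Evaluation API: the talk covers LLM-based judges, and the patterns, function name, and verdict format here are assumptions for demonstration only.

```python
import re

# Illustrative regex patterns standing in for an LLM-based PII judge.
# A real deployment would use an LLM judge (e.g. via Agent Evaluation)
# rather than hand-written rules.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def judge_pii(response: str) -> dict:
    """Return a pass/fail verdict plus the PII types detected.

    The verdict-dict shape mirrors the common judge pattern of a
    boolean outcome plus a rationale the pipeline can log.
    """
    found = [name for name, pattern in PII_PATTERNS.items()
             if pattern.search(response)]
    return {"pass": not found, "pii_types": found}

print(judge_pii("Contact me at alice@example.com"))
# → {'pass': False, 'pii_types': ['email']}
print(judge_pii("Your order has shipped."))
# → {'pass': True, 'pii_types': []}
```

In an EDD loop, a judge like this would be run over every response in the evaluation dataset and its verdicts logged alongside metrics such as groundedness, so regressions surface automatically between iterations.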
Syllabus
Evaluation-Driven Development Workflows: Best Practices and Real-World Scenarios
Taught by
Databricks