Overview
Learn to build comprehensive AI evaluation systems through this hands-on workshop, which covers the complete evaluation lifecycle using Braintrust. Master the fundamentals of AI model assessment by exploring initial prompt testing methodologies, developing robust evaluation frameworks, and implementing both offline and online evaluation strategies. Discover how to set up effective logging systems for production monitoring and gain practical experience with real-world evaluation scenarios.

The session provides step-by-step guidance on creating evaluation pipelines that ensure AI system reliability and performance optimization. Understand best practices for measuring model outputs, tracking performance metrics over time, and maintaining quality standards in production environments. Gain insights from Doug Guthrie, a solutions engineer at Braintrust with extensive experience in data infrastructure deployment, as he demonstrates practical techniques for evaluating AI systems at scale.
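The offline-evaluation loop the workshop describes — run a task over a fixed dataset, score each output, and aggregate the scores — can be sketched in plain Python. This is a generic illustration, not the Braintrust SDK; the `task` stub, the `exact_match` scorer, and the dataset are all hypothetical stand-ins.

```python
# Minimal offline-eval sketch: run a task over a dataset and score outputs.
# Everything here is illustrative; in practice the task would call a model
# and the scorer might be an LLM-based or fuzzy-match metric.

def task(question: str) -> str:
    # Stand-in for a model call; replace with a real LLM invocation.
    canned = {"2+2": "4", "capital of France": "Paris"}
    return canned.get(question, "unknown")

def exact_match(output: str, expected: str) -> float:
    # Simplest possible scorer: 1.0 on an exact match, else 0.0.
    return 1.0 if output == expected else 0.0

dataset = [
    {"input": "2+2", "expected": "4"},
    {"input": "capital of France", "expected": "Paris"},
    {"input": "largest planet", "expected": "Jupiter"},
]

results = []
for case in dataset:
    output = task(case["input"])
    results.append({
        "input": case["input"],
        "output": output,
        "score": exact_match(output, case["expected"]),
    })

average_score = sum(r["score"] for r in results) / len(results)
```

An online evaluation applies the same scorers to logged production traffic instead of a curated dataset, which is why the workshop pairs evaluation pipelines with production logging.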
Syllabus
Evals 101 — Doug Guthrie, Braintrust
Taught by
AI Engineer