Overview
This conference talk from SREcon25 Americas covers how to evaluate and deploy AI models safely in production environments. Brendan Burns from Microsoft shares practical insights from the development of Azure Copilot, focusing on the reliability challenges specific to AI systems. Learn how to build evaluation frameworks for new models and prompts, where performance is not simply "working" or "broken" but has to be assessed probabilistically across many user interactions. Discover methods for deciding whether a model change is an improvement or a regression that requires a fix or rollback. The presentation describes hands-on approaches currently used in production systems to maintain reliability when AI models are core components of the user experience.
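To make the idea of probabilistic assessment concrete, here is a minimal sketch of one common way to compare a candidate model against a baseline: a two-proportion z-test on pass rates from an evaluation set. This is an illustration under stated assumptions, not the method presented in the talk. The function name `compare_pass_rates`, the pass/fail scoring, and the 1.96 threshold are all hypothetical choices.

```python
import math


def compare_pass_rates(base_pass, base_total, cand_pass, cand_total, z_threshold=1.96):
    """Classify a candidate as 'improvement', 'regression', or 'inconclusive'.

    Assumes each evaluation interaction is scored pass/fail (an assumption
    made for this sketch) and applies a pooled two-proportion z-test.
    """
    p_base = base_pass / base_total
    p_cand = cand_pass / cand_total

    # Pooled pass rate under the null hypothesis that both models are equal.
    pooled = (base_pass + cand_pass) / (base_total + cand_total)
    se = math.sqrt(pooled * (1 - pooled) * (1 / base_total + 1 / cand_total))
    if se == 0:
        # All interactions passed (or all failed) for both models: no signal.
        return "inconclusive", 0.0

    z = (p_cand - p_base) / se
    if z >= z_threshold:
        return "improvement", z
    if z <= -z_threshold:
        return "regression", z
    return "inconclusive", z


if __name__ == "__main__":
    # Hypothetical counts: baseline passes 900/1000, candidate passes 850/1000.
    verdict, z = compare_pass_rates(900, 1000, 850, 1000)
    print(verdict, round(z, 2))
```

A small difference on a small sample falls into the "inconclusive" band, which is the point: a rollout decision should rest on enough interactions to separate a real regression from noise.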
Syllabus
SREcon25 Americas - Safe Evaluation and Rollout of AI Models
Taught by
USENIX