Towards Robust GenAI: Techniques for Evaluating Enterprise LLM Applications
MLOps World: Machine Learning in Production via YouTube
Overview
Explore techniques for evaluating enterprise LLM applications in this 45-minute conference talk from MLOps World: Machine Learning in Production. Delve into the challenges of assessing performance and safety in increasingly capable language models. Examine the limitations of traditional human evaluation methods and how they slow enterprise AI adoption. Discover emerging automated evaluation solutions that combine real-time "micro evaluators" with strategic human feedback loops. Learn how to gain continuous insight into a model's strengths, weaknesses, and blind spots. By the end of the talk, acquire strategies for confidently deploying language models in your applications and products and for making your generative AI systems more robust.
Syllabus
Towards Robust GenAI: Techniques for Evaluating Enterprise LLM Applications
Taught by
MLOps World: Machine Learning in Production