Master Agentic AI, GANs, Fine-Tuning & LLM Apps
Most AI Pilots Fail to Scale. MIT Sloan Teaches You Why — and How to Fix It
Overview
Google, IBM & Meta Certificates — All 10,000+ Courses at 40% Off
One annual plan covers every course and certificate on Coursera. 40% off for a limited time.
Get Full Access
Explore comprehensive strategies for evaluating, testing, and securing Large Language Model applications in this 48-minute conference talk from NDC Oslo 2025. Learn how to measure the effectiveness of prompt changes and Retrieval-Augmented Generation (RAG) pipeline modifications through various evaluation frameworks including Vertex AI Evaluation, DeepEval, and Promptfoo. Discover the essential metrics these frameworks provide and understand their practical applications in assessing LLM outputs. Delve into critical security considerations for LLM applications, including protection against prompt injections and prevention of harmful responses through robust input and output guardrails that extend beyond basic safety settings. Examine testing and security frameworks such as LLM Guard to ensure your applications remain safe and operate within precisely defined parameters, providing you with the knowledge to build more reliable and secure LLM-powered solutions.
Syllabus
Beyond the Prompt: Evaluating, Testing, and Securing LLM Applications - Mete Atamel - NDC Oslo 2025
Taught by
NDC Conferences