Overview
Learn how to transition from intuitive "prompt engineering" to a structured, data-driven framework for evaluating large language models in this 13-minute conference talk from Fully Connected London '25. Discover Learnosity's systematic approach to designing, testing, and validating AI educational products for quality and reliability. Explore practical techniques including LLM-as-a-Judge methodologies, synthetic data generation strategies, and W&B Weave-based evaluation pipelines that ensure accuracy, fairness, and trust in real-world learning applications. Gain insights into building robust evaluation systems that move beyond subjective assessment to deliver measurable, reliable AI performance in educational technology contexts.
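The "LLM-as-a-Judge" methodology mentioned above can be sketched in a few lines of Python. This is a minimal illustration of the general pattern, not Learnosity's actual pipeline: the `stub_judge` function, the rubric wording, and the 1-5 scoring scale are all assumptions for the example, and in practice the judge would be a real LLM API call logged through an evaluation tool such as W&B Weave.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Judgment:
    score: int       # 1 (poor) to 5 (excellent) -- assumed scale
    rationale: str   # one-sentence explanation from the judge

# Hypothetical rubric prompt; a real system would tune this carefully.
JUDGE_PROMPT = (
    "You are grading an AI tutor's answer for accuracy and fairness.\n"
    "Question: {question}\nAnswer: {answer}\n"
    "Reply with a score from 1 to 5 and a one-sentence rationale."
)

def evaluate(examples: List[Dict[str, str]],
             judge: Callable[[str], Judgment]) -> float:
    """Run the judge model over a dataset and return the mean score."""
    scores = []
    for ex in examples:
        prompt = JUDGE_PROMPT.format(**ex)
        judgment = judge(prompt)
        scores.append(judgment.score)
    return sum(scores) / len(scores)

# Deterministic stub standing in for a real LLM call.
def stub_judge(prompt: str) -> Judgment:
    return Judgment(score=4, rationale="Mostly accurate and fair.")

examples = [
    {"question": "What is 2 + 2?", "answer": "4"},
    {"question": "Define photosynthesis.",
     "answer": "Plants convert light into chemical energy."},
]
print(evaluate(examples, stub_judge))  # 4.0
```

Replacing the stub with a real model call (and logging each `Judgment` to an evaluation dashboard) turns this into the kind of repeatable, data-driven check the talk contrasts with subjective "vibes"-based assessment.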
Syllabus
Beyond the vibes: Learnosity’s journey to a robust LLM evaluation framework - FC London '25
Taught by
Weights & Biases