Choosing the Right LLM - A Framework for Model Evaluation and Selection
Data Science Festival via YouTube
Overview
Explore a comprehensive technical talk from the Data Science Festival Oktoberfest 2024 in which Emma Mani from the Financial Times presents a practical framework for evaluating and selecting Large Language Models (LLMs) for specific tasks. Learn how to move beyond standard benchmarks such as Hugging Face's MTEB leaderboard and vendor-provided metrics to develop custom evaluation methods, demonstrated through real-world summarization tasks at the Financial Times. Dive into technical aspects including article summarization techniques, vectorization, cosine-similarity calculations, LLM parameter optimization, and analysis of result distributions. Gain insights into creating replicable testing frameworks that can be adapted to various LLM use cases, helping teams make informed decisions when choosing between multiple AI models. Designed for technical practitioners, this 44-minute presentation offers actionable approaches to navigating the growing landscape of LLM options and selecting the most suitable model for specific organizational needs.
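To make the vectorization and cosine-similarity step concrete, here is a minimal sketch of how a candidate summary could be scored against a reference. It uses toy bag-of-words count vectors purely for illustration; the talk's actual pipeline at the Financial Times is not detailed here, and a real evaluation would use an embedding model (e.g. one ranked on MTEB) rather than word counts. The texts and function names are illustrative assumptions.

```python
from collections import Counter
import math

def vectorize(text):
    # Toy bag-of-words vectorization (illustrative only); a real
    # pipeline would embed the text with a sentence-embedding model.
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    # cos(theta) = (a . b) / (|a| * |b|)
    shared = set(a) & set(b)
    dot = sum(a[w] * b[w] for w in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical reference vs. model-generated summary
reference = vectorize("The central bank raised interest rates to curb inflation")
candidate = vectorize("Interest rates were raised to fight inflation")
score = cosine_similarity(reference, candidate)
```

Scoring many article/summary pairs this way per model, then comparing the resulting score distributions rather than single averages, mirrors the distribution-analysis approach the talk describes.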
Syllabus
The Quest for the Best: How to Choose the Right LLM
Taught by
Data Science Festival