

RAG-LLM Evals & Test Automation for Beginners

via Udemy

Overview

Understand, evaluate, and test RAG-LLMs (AI-based systems) from scratch using a RAGAS-Python-Pytest framework

What you'll learn:
  • How custom Large Language Models (LLMs) are designed using the Retrieval Augmented Generation (RAG) architecture
  • Common benchmarks and metrics used in evaluating RAG-based LLMs
  • Introduction to the RAGAS evaluation framework for evaluating and testing LLMs
  • Writing practical scripts to automate and assert the metric scores of LLMs
  • Automating scenarios such as single-turn and multi-turn interactions with LLMs using the RAGAS framework
  • Generating test data for evaluating LLM metrics using the RAGAS framework
  • Creating a RAGAS Pytest evaluation framework to assert the metrics of RAG (custom) LLMs

LLMs are everywhere! Every business is building its own custom AI-based RAG-LLMs to improve customer service. But how are engineers testing them? Unlike traditional software testing, AI-based systems need a special methodology for evaluation.

This course starts from the ground up, explaining the architecture of how AI systems (LLMs) work behind the scenes. Then, it dives deep into LLM evaluation metrics.

This course shows you how to effectively use the RAGAS framework library to evaluate LLM metrics through scripted examples. This allows you to use Pytest assertions to check metric benchmark scores and design a robust LLM Test/evaluation automation framework.
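The idea of asserting a metric score against a benchmark can be sketched without any external service. The example below is a minimal illustration only: `toy_context_recall` is a hypothetical token-overlap stand-in, not the RAGAS API (RAGAS metrics are typically scored with the help of an LLM), and the sample texts and the 0.7 threshold are assumptions made for the example.

```python
import re
from typing import List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase a string and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def toy_context_recall(reference: str, retrieved_contexts: List[str]) -> float:
    """Fraction of reference tokens that also appear in the retrieved contexts.

    Hypothetical stand-in for a real evaluation metric, used only to show
    the shape of a score-then-assert test.
    """
    reference_tokens = tokenize(reference)
    if not reference_tokens:
        return 0.0
    context_tokens: Set[str] = set()
    for context in retrieved_contexts:
        context_tokens |= tokenize(context)
    return len(reference_tokens & context_tokens) / len(reference_tokens)


def test_context_recall_meets_threshold():
    # Hypothetical sample: what the RAG system retrieved vs. the expected answer.
    reference = "The course has 23 articles"
    retrieved_contexts = ["There are 23 articles in the course."]

    score = toy_context_recall(reference, retrieved_contexts)

    # The threshold is an assumption; a real suite would choose it per metric.
    assert score >= 0.7, f"context recall too low: {score:.2f}"
```

Pytest collects any function whose name starts with `test_`, so the file needs no Pytest import for a plain assertion like this one. In a real suite the toy scoring function would be replaced by a call into the evaluation library.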


What will you learn from the course?

  • High-level overview of Large Language Models (LLMs)

  • Understand how custom LLMs are built using the Retrieval Augmented Generation (RAG) architecture

  • Common benchmarks and metrics used in evaluating RAG-based LLMs

  • Introduction to the RAGAS evaluation framework for evaluating and testing LLMs

  • Writing practical scripts to automate and assert the metric scores of LLMs

  • Automating scenarios such as single-turn and multi-turn interactions with LLMs using the RAGAS framework

  • Generating test data for evaluating LLM metrics using the RAGAS framework


By the end of the course, you will be able to create a RAGAS Pytest evaluation framework to assert the metrics of RAG (custom) LLMs.


Important Note:

This course covers the top 7 metrics commonly used to evaluate and test LLMs. The same logic can be applied to any other metric evaluation.
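One way to read "the same logic" is that every metric reduces to a score compared against a minimum. The sketch below is an assumption-laden illustration: `metrics_below_threshold` is a hypothetical helper, and the metric names, scores, and thresholds are made up for the example rather than taken from the course.

```python
from typing import Dict, List


def metrics_below_threshold(
    scores: Dict[str, float], thresholds: Dict[str, float]
) -> List[str]:
    """Return the sorted names of metrics whose score is under its minimum.

    A metric that has a threshold but no score is treated as failing.
    """
    return sorted(
        name
        for name, minimum in thresholds.items()
        if scores.get(name, 0.0) < minimum
    )


def test_all_metrics_meet_thresholds():
    # Hypothetical scores, as if produced by an earlier evaluation run.
    scores = {"faithfulness": 0.91, "context_recall": 0.84, "answer_relevancy": 0.78}
    # Hypothetical per-metric minimums.
    thresholds = {"faithfulness": 0.8, "context_recall": 0.7, "answer_relevancy": 0.7}

    failures = metrics_below_threshold(scores, thresholds)

    assert failures == [], f"metrics below threshold: {failures}"
```

Adding another metric then means adding one entry to each dictionary, with no change to the assertion itself.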


Hands-on Experience:

The course provides a practice RAG-LLM for hands-on work, but in the scripting phase you need a basic OpenAI subscription to access their APIs (a minimum credit of $10 will suffice).


Course Prerequisites:

  • Python and Pytest basics are required to understand the framework.

    Two dedicated sections at the end of this course give you the Python and Pytest knowledge required to follow along.

  • Basic knowledge of API testing.



Syllabus

  • Introduction to AI concepts - LLMs & RAG LLMs
  • Understand RAG (Retrieval Augmented Generation) - LLM Architecture with Use Case
  • Getting started with Practice LLMs and the approach to Evaluate/Test
  • Setup Python & Pytest Environment with RAGAS LLM Evaluation Package Libraries
  • Programmatic solution to evaluate LLM Metrics with LangChain and RAGAS Libraries
  • Optimize LLM Evaluation tests with Pytest Fixtures & Parameterization techniques
  • Evaluate LLM Core Metrics and importance of EvalDataSet in RAGAS Framework
  • Upload LLM Evaluation results & Test LLM for Multi Conversational Chat History
  • Create Test Data dynamically to evaluate LLM & Generate Rubrics Evaluation Score
  • Conclusion and next steps!
  • Optional - Learn Python Fundamentals with examples
  • Optional - Overview of Pytest Framework basics with examples
  • Bonus Lecture

Taught by

Rahul Shetty Academy

Reviews

4.6 rating at Udemy based on 897 ratings
