

Benchmarking LLMs on Text Generation

via CodeSignal

Overview

This course explores benchmarking for open-ended generation tasks like summarization. You'll experiment with different prompting styles, compare models like GPT-3.5 and GPT-4, and evaluate results using both fuzzy string similarity and semantic similarity via embeddings.
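The two evaluation styles mentioned above can be sketched in plain Python: a character-level fuzzy match using the standard library's difflib, and cosine similarity over embedding vectors. This is a minimal illustration, not course material; the short vectors below are toy stand-ins for real embedding output, which in the course comes from OpenAI's embedding models.

```python
from difflib import SequenceMatcher
import math

def fuzzy_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] via difflib's ratio."""
    return SequenceMatcher(None, a, b).ratio()

def cosine_similarity(u: list[float], v: list[float]) -> float:
    """Cosine of the angle between two vectors: dot(u, v) / (|u| * |v|)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    return dot / (norm_u * norm_v)

# Fuzzy similarity works directly on strings...
reference = "The cat sat on the mat."
candidate = "A cat was sitting on the mat."
print(fuzzy_similarity(reference, candidate))

# ...while semantic similarity compares embedding vectors
# (toy 3-dimensional vectors here, in place of real embeddings).
print(cosine_similarity([0.1, 0.9, 0.2], [0.2, 0.8, 0.1]))
```

The contrast is the point: fuzzy matching rewards surface overlap between strings, while cosine similarity on embeddings can score paraphrases as close even when few characters match.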

Syllabus

  • Unit 1: Prompting for Summarization with LLMs
    • Reading CSV Files with Python
    • Simplifying Prompts for Better Summaries
    • Crafting Custom Prompts for Better Summaries
  • Unit 2: Scoring and Comparing Models with ROUGE
    • Loading Data for ROUGE Evaluation
    • Setting Up ROUGE for Summary Evaluation
    • Evaluating Summaries with ROUGE Metrics
    • Evaluating GPT-4 Summarization with ROUGE-L
  • Unit 3: Semantic Evaluation with Embeddings
    • Implementing Cosine Similarity for Vector Comparison
    • Creating Text Embeddings with OpenAI
    • Building a Semantic Comparison Pipeline
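Unit 2's ROUGE-L metric scores a candidate summary by the longest common subsequence (LCS) of tokens it shares with a reference. As a rough sketch of the idea (the course presumably uses an established ROUGE package rather than a hand-rolled version like this):

```python
def lcs_length(a: list[str], b: list[str]) -> int:
    """Longest common subsequence length via dynamic programming."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def rouge_l_f1(reference: str, candidate: str) -> float:
    """ROUGE-L F1: harmonic mean of LCS-based precision and recall."""
    ref, cand = reference.split(), candidate.split()
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)   # fraction of candidate tokens in the LCS
    recall = lcs / len(ref)       # fraction of reference tokens in the LCS
    return 2 * precision * recall / (precision + recall)

print(rouge_l_f1("the cat sat on the mat", "the cat on the mat"))
```

Because the LCS allows gaps but preserves word order, ROUGE-L credits summaries that keep the reference's sequence of ideas without requiring contiguous n-gram matches.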

