
YouTube

Superfast RAG with Llama 3 and Groq - Implementing a Retrieval-Augmented Generation Pipeline

James Briggs via YouTube

Overview

Explore a 17-minute video tutorial on implementing a Retrieval-Augmented Generation (RAG) pipeline using Meta's Llama 3 70B model served via the Groq API, the open-source e5 encoder, and the Pinecone vector database. Learn how Groq's Language Processing Units (LPUs) enable ultra-fast LLM inference, how to call Llama 3 from Python, how to initialize e5 for embeddings, and how to use Pinecone for efficient retrieval. Discover the rationale for concatenating each document's title with its content before embedding, test RAG retrieval performance, and generate grounded answers with Llama 3 70B. Gain insight into why Groq matters for AI applications, and use the provided code repository for hands-on practice.
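As a rough illustration of the generation step described above, the sketch below calls Llama 3 70B through the Groq Python SDK. The prompt wording, the helper name `build_messages`, and the sample context are illustrative assumptions, not the video's exact code; the model id `llama3-70b-8192` is Groq's hosted Llama 3 70B identifier at the time of the video but may change.

```python
import os

def build_messages(question: str, context: str) -> list:
    """Assemble a chat message list that grounds the LLM in retrieved context.
    (Hypothetical prompt wording; the video's actual prompts may differ.)"""
    system = "Answer the question using only the provided context."
    user = f"Context:\n{context}\n\nQuestion: {question}"
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]

# Only hit the API when a Groq key is configured.
if os.environ.get("GROQ_API_KEY"):
    from groq import Groq  # pip install groq

    client = Groq(api_key=os.environ["GROQ_API_KEY"])
    resp = client.chat.completions.create(
        model="llama3-70b-8192",  # Groq-hosted Llama 3 70B
        messages=build_messages(
            "Why is Groq inference so fast?",
            "Groq builds Language Processing Units (LPUs) designed for "
            "deterministic, low-latency LLM inference.",
        ),
    )
    print(resp.choices[0].message.content)
```

In a full RAG loop, the `context` argument would be filled with the passages retrieved from the vector database rather than a hard-coded string.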

Syllabus

Groq and Llama 3 for RAG
Llama 3 in Python
Initializing e5 for Embeddings
Using Pinecone for RAG
Why We Concatenate Title and Content
Testing RAG Retrieval Performance
Initializing Connection to Groq API
Generating RAG Answers with Llama 3 70B
Final Points on Why Groq Matters

Taught by

James Briggs

