
Large Language Models (LLMs) and Retrieval Augmented Generation (RAG)

via Udacity

Overview

Master Large Language Models (LLMs) and build sophisticated text generation applications in this hands-on course. You’ll learn prompt engineering techniques, optimize model selection and costs, and dive deep into Retrieval-Augmented Generation (RAG), using vector databases to ground AI responses in external data and reduce hallucinations. Finally, you’ll evaluate system performance with RAGAS and showcase your skills by building an end-to-end RAG application.

Syllabus

  • Introduction to LLMs and Retrieval Augmented Generation (RAG)
    • Introduces Large Language Models (LLMs), their core concepts, and the course structure. Covers prerequisites, environment setup, and defines Retrieval-Augmented Generation (RAG).
  • The Large Language Model Landscape
    • Explore the four core capabilities of LLMs: generation, summarization, classification, and reasoning. Covers real-world applications and the importance of RAG for building trust.
  • Implementing a Chatbot with an LLM
    • Learn to build a stateful chatbot using an LLM. Covers managing conversation history, using system prompts to define behavior, and understanding message roles (system, user, assistant).
  • LLM Prompting & Inference
    • Defines prompt engineering and its components. Explains how to control LLM outputs using inference parameters like temperature, top-p, max tokens, and stop sequences.
  • Applied Prompting and Inference
    • Apply prompting techniques hands-on. Implement Chain of Thought (CoT) prompting to improve reasoning and test how different inference parameters change model behavior.
  • Prompt Instruction Refinement
    • Explains the theory of systematically refining prompt instructions by modifying components like Role, Task, Context, Examples, and Output Format.
  • Applying Prompt Instruction Refinement with Python
    • Provides hands-on practice iteratively refining a prompt to transform a generic recipe analyzer into a precise dietary consultant that produces structured JSON.
  • Tokens, Embeddings, and Vector Search
    • Covers the foundations of NLP for LLMs. Defines tokenization, embeddings as semantic vectors, and vector search (similarity search) as the basis for finding relevant information.
  • Implementing Tokens, Embeddings, and Vector Search
    • Hands-on practice with tokenization. Implement embedding generation and vector search to build a semantic search system from scratch.
  • Strategic Model Selection & Economics
    • Learn the business trade-offs of model selection. Covers performance, cost, speed, and control, framed through Total Cost of Ownership (TCO). Compares general-purpose (generation) vs. specialized (reasoning) models.
  • Applying Model Selection and Economics
    • Apply model selection theory. Calculate Total Cost of Ownership (TCO) including error costs. Implement a hybrid model routing system to balance cost and quality.
  • Retrieval Augmented Generation (RAG) Workflow
    • Introduces the RAG architecture. Compares naive vs. advanced modular RAG. Covers the data ingestion pipeline, focusing on data formats and intelligent chunking strategies.
  • Semantic Search with Vector Databases for RAG
    • Explains semantic search and the role of vector databases. Covers indexing algorithms (HNSW) for speed and advanced retrieval techniques like HyDE and re-ranking.
  • Prompt Engineering for RAG Synthesis
    • Learn to write prompts for RAG. Covers grounding answers in context, handling conflicts, managing uncertainty, and enforcing verifiability by generating inline citations.
  • Implementing RAG with Vector Databases
    • Build a complete RAG system. Practice vector database operations in ChromaDB, including adding documents, applying metadata filters, and implementing a retrieval and generation pipeline.
  • Evaluating RAG Systems
    • Learn to evaluate RAG system quality. Introduces key metrics: Context Precision, Context Recall, Faithfulness, and Answer Relevancy. Covers frameworks like RAGAS.
  • RAG Evaluation Implementation
    • Implement a RAG evaluation pipeline using the RAGAS framework. Learn to calculate and interpret quality metrics to diagnose and improve your RAG system's performance.
  • Project: NASA Mission Intelligence: Developing a RAG-Based Chat System
    • Build an end-to-end RAG chatbot. Ingest NASA mission data, build a vector search pipeline, generate answers, and evaluate the system's quality.
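The chatbot lessons center on conversation history, system prompts, and message roles. As a rough sketch of the idea (the message format below follows the common OpenAI-style chat convention and is an assumption, not the course's exact code):

```python
# Minimal sketch of a stateful chat history using system/user/assistant roles.
# The dict format mirrors the widely used OpenAI-style chat convention.

def make_history(system_prompt):
    """Start a conversation with a system message that defines the bot's behavior."""
    return [{"role": "system", "content": system_prompt}]

def add_turn(history, user_text, assistant_text):
    """Record one user/assistant exchange so later requests keep context."""
    history.append({"role": "user", "content": user_text})
    history.append({"role": "assistant", "content": assistant_text})
    return history

history = make_history("You are a concise cooking assistant.")
add_turn(history, "How do I poach an egg?",
         "Simmer water, add a splash of vinegar, slide the egg in for 3 minutes.")
# On each new request, the full history (system prompt + all prior turns + the
# new user message) is sent to the LLM -- that replay is what makes it stateful.
```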
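The vector search lessons build on the idea that semantically similar texts get nearby embedding vectors. A toy sketch, with hand-made stand-in vectors instead of a real embedding model:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def search(query_vec, docs):
    """Rank documents by cosine similarity to the query vector (best first)."""
    return sorted(docs, key=lambda d: cosine_similarity(query_vec, d["vec"]),
                  reverse=True)

docs = [
    {"text": "rocket launch schedules", "vec": [0.9, 0.1, 0.0]},
    {"text": "pasta recipes",           "vec": [0.0, 0.2, 0.9]},
    {"text": "satellite orbits",        "vec": [0.8, 0.3, 0.1]},
]
results = search([1.0, 0.0, 0.0], docs)
# The space-related documents rank ahead of "pasta recipes" because their
# vectors point in nearly the same direction as the query vector.
```

Real systems replace the brute-force sort with an approximate index such as HNSW, which the syllabus covers for exactly this scaling reason.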
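The RAG ingestion lessons discuss chunking strategies. One simple baseline is fixed-size chunks with overlap, sketched here (character-based for brevity; the course's "intelligent chunking" goes further):

```python
def chunk_text(text, chunk_size=50, overlap=10):
    """Split text into fixed-size chunks that overlap, so content cut at a
    chunk boundary still appears intact in the neighboring chunk."""
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks

sample = "The quick brown fox jumps over the lazy dog. " * 3  # 135 characters
chunks = chunk_text(sample, chunk_size=50, overlap=10)
# Each chunk's last 10 characters repeat as the next chunk's first 10,
# which is the overlap that preserves context across boundaries.
```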
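The model-economics lessons frame TCO as more than the per-request price: a cheap model's errors have costs too. A back-of-envelope sketch with illustrative numbers (not real provider rates), plus a trivial hybrid-routing rule:

```python
# Hypothetical prices and error rates, chosen only to illustrate the trade-off.

def tco_per_1k_requests(price_per_request, error_rate, cost_per_error):
    """Total cost of ownership = direct inference cost + expected error cost."""
    return 1000 * (price_per_request + error_rate * cost_per_error)

cheap_model = tco_per_1k_requests(0.001, 0.08, 0.50)    # cheap but error-prone
premium_model = tco_per_1k_requests(0.010, 0.01, 0.50)  # 10x price, fewer errors

def route(query, premium_keywords=("refund", "legal")):
    """Hybrid routing sketch: send high-stakes queries to the stronger model,
    everything else to the cheap one."""
    return "premium" if any(k in query.lower() for k in premium_keywords) else "cheap"
```

With these assumed numbers, the premium model's total cost per 1,000 requests comes out lower despite its 10x price, because the cheap model's error cost dominates.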
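The evaluation lessons introduce retrieval metrics such as Context Precision and Context Recall. RAGAS computes these with LLM judgments; as a simplified sketch, ground-truth relevance labels stand in for that:

```python
# Simplified retrieval metrics: exact-match relevance labels replace the
# LLM-based judgments a framework like RAGAS would use.

def context_precision(retrieved, relevant):
    """Fraction of retrieved chunks that are actually relevant."""
    return sum(1 for c in retrieved if c in relevant) / len(retrieved)

def context_recall(retrieved, relevant):
    """Fraction of relevant chunks that the retriever actually found."""
    return sum(1 for c in relevant if c in retrieved) / len(relevant)

retrieved = ["apollo 11 landed in 1969", "pasta recipe", "apollo crew list"]
relevant  = ["apollo 11 landed in 1969", "apollo crew list", "apollo budget"]
# "pasta recipe" hurts precision (retrieved but irrelevant);
# "apollo budget" hurts recall (relevant but missed).
```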

Taught by

Eduardo Mota

Reviews

Rated 5.0 at Udacity, based on 5 ratings.

