Overview
Learn to build a production-ready RAG (Retrieval-Augmented Generation) AI agent in Python in this step-by-step tutorial. The course covers what separates production-grade applications from basic tutorial projects: observability, logging, retries, throttling, and rate limiting. The architecture uses Inngest for workflow orchestration, Qdrant as the vector database, and LlamaIndex for document processing. You will set up the development environment, configure the API, and connect the Inngest dev server; store and retrieve documents in the vector database; load PDF documents and split them into chunks suitable for retrieval; query the vector database to retrieve relevant passages; and add a frontend for interacting with the RAG system. The final section applies rate limiting, throttling, and concurrency management so the application can handle real-world traffic and usage patterns.
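To give a sense of what "chunking" a document means before it is embedded and stored, here is a minimal sketch in plain Python. It is an illustration only, not the course's code: the course uses LlamaIndex for document processing, whereas the function name `chunk_text` and the `chunk_size` and `overlap` parameters below are hypothetical and work on characters rather than tokens or sentences.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Hypothetical helper for illustration; real pipelines usually split on
    sentence or token boundaries instead of raw character offsets.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - overlap  # how far each window advances
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text, so the
        # tail is not emitted again as a shorter duplicate chunk.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    sample = "abcdefghij"
    for piece in chunk_text(sample, chunk_size=4, overlap=2):
        print(piece)
```

The overlap means text near a chunk boundary appears in two neighbouring chunks, which helps a retrieval step find passages that would otherwise be cut in half.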
Syllabus
00:00:00 | Overview
00:01:21 | Project Demo
00:04:07 | Architecture & Tools Breakdown
00:08:23 | Project Setup & Dependencies
00:11:22 | API Setup
00:12:10 | Inngest Dev Server Setup
00:25:06 | Vector Database Setup
00:36:48 | Loading & Chunking PDFs
00:58:09 | Querying Our VectorDB
01:08:54 | Adding the Frontend
01:13:56 | Rate Limiting, Throttling & Concurrency
Taught by
Tech With Tim