Overview
Transition from theoretical concepts to production-ready engineering in this hands-on course, the final part of the "Fundamentals of Generative AI" specialization. Designed for learners ready to move beyond theory, this course focuses entirely on construction: you won't just learn about Large Language Models (LLMs); you will build, refine, and deploy them.
We start at the foundational level, coding different types of Transformer architectures from scratch using PyTorch. Through high-performance training with Automatic Mixed Precision and ROUGE/BLEU evaluation, you will learn the techniques to scale custom components into optimized systems. By utilizing pre-trained models and weighing performance trade-offs, you will gain the insight needed to select the most efficient path for large-scale deployment.
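To make the evaluation side concrete, here is a minimal, dependency-free sketch of ROUGE-1 scoring (unigram overlap between a reference and a candidate summary). This is an illustrative simplification: the course's evaluation would typically use an established library such as `rouge-score`, with proper tokenization and stemming.

```python
from collections import Counter

def rouge_1(reference: str, candidate: str) -> dict:
    """ROUGE-1: precision, recall, and F1 over unigram overlap.

    A teaching sketch only -- whitespace tokenization, no stemming.
    """
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: each candidate unigram counts at most as often
    # as it appears in the reference (Counter & takes the minimum).
    overlap = sum((ref_counts & cand_counts).values())
    precision = overlap / max(sum(cand_counts.values()), 1)
    recall = overlap / max(sum(ref_counts.values()), 1)
    f1 = (2 * precision * recall / (precision + recall)) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}

scores = rouge_1("the cat sat on the mat", "the cat sat here")
print(scores)  # precision 0.75, recall 0.5, f1 0.6
```

BLEU follows the same overlap idea but is precision-oriented over multiple n-gram orders with a brevity penalty, which is why the two metrics are usually reported together.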
Moving to applied architecture, you will master Retrieval Augmented Generation (RAG) using LangChain, learning to evaluate pipelines and apply advanced techniques such as chunking strategies, reranking, compression, and query transformation. You'll also navigate model selection and the critical trade-offs between RAG and fine-tuning.
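The retrieve-then-generate flow at the heart of RAG can be sketched without any framework. The toy example below (standard library only) uses fixed-size word chunking and word-overlap scoring as stand-ins for the real chunking strategies, embeddings, and vector stores the course builds with LangChain; all function names here are illustrative.

```python
def chunk(text: str, size: int = 12) -> list[str]:
    """Fixed-size word chunking -- the simplest chunking strategy."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def score(query: str, passage: str) -> float:
    """Word-overlap score standing in for embedding similarity."""
    q, p = set(query.lower().split()), set(passage.lower().split())
    return len(q & p) / max(len(q), 1)

def retrieve(query: str, chunks: list[str], k: int = 1) -> list[str]:
    """Return the k highest-scoring chunks for the query."""
    return sorted(chunks, key=lambda c: score(query, c), reverse=True)[:k]

docs = ("The Transformer architecture relies on self-attention. "
        "LangChain helps compose retrieval pipelines for LLM applications.")
top = retrieve("What does LangChain help with?", chunk(docs, size=8))
# The retrieved chunk(s) would then be prepended to the prompt sent
# to the LLM, grounding its answer in external knowledge.
```

Reranking and query transformation slot into this same pipeline: a reranker re-scores the top-k candidates with a stronger model, and query transformation rewrites the question before retrieval.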
Finally, you will step into the future of AI by developing autonomous Agents. You will bridge the gap between development and production by setting up a professional workflow with Poetry and deploying a Summarizer AI Agent directly to the Google Cloud Platform (Vertex AI). By the end of this course, you will possess a tangible portfolio of code and a live deployment, proving your ability to engineer robust Generative AI solutions.
Syllabus
- Building a Transformer From Scratch
- In this module, we dive deep into the Transformer architecture, its core mechanics, and different transformer architecture types (encoder-only, decoder-only, encoder-decoder). We gain hands-on experience by building and training a complete suite of PyTorch-based models from scratch. The module concludes with strategic deployment skills, teaching when to build custom models versus leveraging pre-trained models for efficiency and state-of-the-art results.
- Retrieval Augmented Generation (RAG): Bridging the LLM Knowledge Gap
- Module 2 addresses the limitations of static knowledge and hallucinations in Large Language Models (LLMs) by introducing Retrieval Augmented Generation (RAG). Learners will progress from building fundamental pipelines with Ollama and LangChain to implementing production-ready systems by adding rigorous RAG evaluation and utilizing advanced techniques such as custom chunking strategies, vector stores, reranking, and query transformations to optimize context retrieval and response generation. The module concludes with an overview of another adaptation technique, fine-tuning, and a comparison of RAG vs. fine-tuning.
- AI Agents with ADK
- Module 3 marks a pivotal transition from passive information retrieval to the dynamic realm of autonomous AI Agents, anchored by the "Understand, Think, Take Action" conceptual framework. Students will critically evaluate development ecosystems before applying these concepts to build a functional Summarizer Agent. The module emphasizes professional engineering standards, guiding learners through a complete lifecycle that includes environment management with Poetry, deployment to the Vertex AI Agent Engine, and the implementation of robust performance monitoring using Google Cloud Platform’s logging and tracing tools.
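The "Understand, Think, Take Action" loop described in Module 3 can be sketched as a toy summarizer agent. This is purely illustrative: the class and method names below are invented for the sketch, not the actual API of Google's Agent Development Kit (ADK), and a real agent would delegate the "think" step to an LLM rather than a heuristic.

```python
class SummarizerAgent:
    """Toy agent illustrating the Understand -> Think -> Act loop.

    Names here are hypothetical; the course builds the real agent
    with Google's ADK and deploys it to Vertex AI.
    """

    def understand(self, request: str) -> str:
        # Parse the incoming request into a task payload.
        return request.removeprefix("summarize:").strip()

    def think(self, text: str) -> str:
        # Choose a plan. A real agent would reason with an LLM; this
        # heuristic keeps the first sentence as the summary.
        return "extract_first_sentence" if "." in text else "echo"

    def act(self, plan: str, text: str) -> str:
        # Execute the chosen plan (the "take action" step).
        if plan == "extract_first_sentence":
            return text.split(".")[0].strip() + "."
        return text

    def run(self, request: str) -> str:
        text = self.understand(request)
        plan = self.think(text)
        return self.act(plan, text)

agent = SummarizerAgent()
print(agent.run("summarize: Agents act autonomously. They use tools."))
# -> "Agents act autonomously."
```

In production, each loop iteration would also emit structured logs and traces, which is exactly what the module's Google Cloud monitoring section covers.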
Taught by
Amreen Anbar, Soroush Razavi, and Anahita Doosti