Creating Sequence Models and Transformers
via Udacity
Overview
This course covers the fundamentals and applications of sequence modeling. It begins with an overview of sequence models and their significance, followed by hands-on lessons on tokenizing text and developing embeddings with PyTorch. Participants explore recurrent neural networks (RNNs) and their variants, including LSTMs and GRUs, before progressing to Seq2Seq models and the implementation of attention mechanisms. The course culminates in a comprehensive treatment of transformers, self-attention, and industry evaluation practices. By the end, students will have built a transformer-based Q&A system, solidifying their grasp of modern NLP frameworks.
Syllabus
- Course Overview
- Get introduced to Creating Sequence Models and Transformers, meet your instructor, and discover the course objectives.
- Why Sequence Models Matter
- Discover why the order of data matters, real-world applications of sequence models, key challenges, and foundational techniques for processing and modeling sequential information.
- Breaking Text into Tokens
- Learn how text is broken into tokens, assigned numerical IDs, and prepared for AI models. Explore tokenization strategies and their impact on model performance and costs.
- Creating a Tokenizer with PyTorch
- Learn how to build a tokenizer in PyTorch, transforming text into numerical data for NLP models, and compare simple word tokenization with advanced subword techniques used in modern AI. A minimal tokenizer sketch follows the syllabus.
- Finding Meaning with Word Embeddings
- Learn how word embeddings transform words into vectors, letting machines grasp meaning, relationships, and analogies for smarter NLP applications like search, recommendations, and translation.
- Pretrained Embeddings with PyTorch
- Learn to use pre-trained word embeddings in PyTorch, explore GloVe vectors, visualize relationships, solve analogies, and understand static versus contextual embeddings in NLP.
- Sequence Modeling with RNNs, LSTMs, and GRUs
- Learn how RNNs, LSTMs, and GRUs model sequences, overcome memory limitations, and enable applications like text prediction, translation, and time series forecasting.
- Predicting Sequences with RNN Variants in PyTorch
- Discover how RNN, LSTM, and GRU networks in PyTorch excel at predicting sequences, comparing their memory, accuracy, and efficiency in character-level text tasks. A side-by-side sketch of the three cells follows the syllabus.
- Translating Ideas with Seq2Seq Models
- Explore how Seq2Seq models use encoder-decoder architectures to transform sequences, enabling tasks like translation, summarization, and chatbot responses.
- Building Your First Seq2Seq Network with PyTorch
- Build a Seq2Seq model in PyTorch for tasks like translation and Q&A, learning about encoders, decoders, teacher forcing, and the context vector bottleneck. A bare-bones encoder-decoder sketch follows the syllabus.
- Focusing Model Attention on What Matters
- Learn how attention mechanisms let AI models focus on relevant input parts, overcoming Seq2Seq bottlenecks and improving tasks like translation and summarization.
- Enhancing Seq2Seq with Attention in PyTorch
- Learn to enhance Seq2Seq models in PyTorch using attention mechanisms, robust QA evaluation with EM and F1, and data-centric strategies to address complex language tasks.
- Transformers and the Power of Self-Attention
- Explore Transformer architecture, self-attention, multi-head attention, and positional encoding to understand how modern AI models process language efficiently and contextually. A scaled dot-product attention sketch follows the syllabus.
- Exploring Transformers with HuggingFace
- Discover how Transformers process language by exploring tokenization, contextual embeddings, and attention visualization using Hugging Face and BERT for model interpretability. A short inspection sketch follows the syllabus.
- Evaluating Transformer Models: Concepts and Industry Practices
- Learn to rigorously evaluate Transformer models using metrics like EM, F1, BLEU, and ROUGE, and qualitative error analysis for reliable, real-world language tasks.
- Implementing Evaluation Metrics with Hugging Face
- Learn to implement and interpret evaluation metrics like EM, F1, Precision@k, and MRR for QA models using Hugging Face, and perform error analysis to guide model improvement. From-scratch EM and F1 implementations are sketched after the syllabus.
- Fine-Tuning Transformers for Q&A
- Learn to fine-tune pretrained transformers for extractive Q&A: prepare data, train with Hugging Face, evaluate on SQuAD with EM and F1, and apply these techniques in real-world scenarios.
- Comparing RNNs and Transformers
- Compare RNNs (sequential processing) and Transformers (parallel self-attention) for sequence tasks in AI, focusing on strengths, limits, and real-world applications.
- Project: Build a Transformer-Based Q&A System
- You will build a semantic retriever for an internal AI assistant, using transformer embeddings to encode queries and documents and returning the top-k passages an LLM uses to generate answers. A toy retrieval sketch follows the syllabus.
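Illustrative Code Sketches
The sketches below accompany some of the hands-on lessons in the syllabus. They are minimal illustrations under stated assumptions, not the course's own solution code.
For "Creating a Tokenizer with PyTorch": a word-level tokenizer that builds a vocabulary and maps text to tensors of IDs. The vocabulary-building strategy, the special tokens, and the class name are assumptions made for this sketch.

```python
# A minimal word-level tokenizer sketch in PyTorch. The vocabulary strategy,
# special tokens, and class name are illustrative assumptions.
import torch
from collections import Counter

class WordTokenizer:
    def __init__(self, texts, min_freq=1):
        # Count word frequencies across the corpus.
        counts = Counter(word for text in texts for word in text.lower().split())
        # Reserve IDs 0 and 1 for padding and unknown tokens.
        self.vocab = {"<pad>": 0, "<unk>": 1}
        for word, freq in counts.items():
            if freq >= min_freq:
                self.vocab[word] = len(self.vocab)

    def encode(self, text):
        # Map each word to its ID, falling back to <unk> for unseen words.
        ids = [self.vocab.get(w, self.vocab["<unk>"]) for w in text.lower().split()]
        return torch.tensor(ids, dtype=torch.long)

texts = ["the cat sat on the mat", "the dog chased the cat"]
tokenizer = WordTokenizer(texts)
print(tokenizer.encode("the cat chased a dog"))  # unseen "a" maps to <unk>
```

Subword tokenizers (the lesson's second topic) replace the whitespace split with learned merges, which keeps the vocabulary small while avoiding most unknown tokens.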
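For "Predicting Sequences with RNN Variants in PyTorch": a side-by-side pass through nn.RNN, nn.LSTM, and nn.GRU on the same toy batch. All shapes and hyperparameters are arbitrary assumptions for illustration.

```python
# Running the same embedded batch through three recurrent cells and a shared
# output head. Dimensions are arbitrary illustrative choices.
import torch
import torch.nn as nn

batch, seq_len, vocab_size, embed_dim, hidden_dim = 2, 10, 50, 16, 32
x = torch.randint(0, vocab_size, (batch, seq_len))       # token IDs
embed = nn.Embedding(vocab_size, embed_dim)
inputs = embed(x)                                         # (batch, seq_len, embed_dim)
head = nn.Linear(hidden_dim, vocab_size)                  # shared next-token head

for name, rnn in [("RNN", nn.RNN(embed_dim, hidden_dim, batch_first=True)),
                  ("LSTM", nn.LSTM(embed_dim, hidden_dim, batch_first=True)),
                  ("GRU", nn.GRU(embed_dim, hidden_dim, batch_first=True))]:
    outputs, _ = rnn(inputs)                              # (batch, seq_len, hidden_dim)
    logits = head(outputs[:, -1])                         # predict the next token
    print(f"{name}: outputs {tuple(outputs.shape)}, logits {tuple(logits.shape)}")
```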
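For "Building Your First Seq2Seq Network with PyTorch": a bare-bones GRU encoder-decoder trained with teacher forcing. The choice of GRUs, the dimensions, and the single-step decoding loop are assumptions; a real model would add a dataset, an optimizer step, and inference-time greedy or beam decoding.

```python
# Sketch of a Seq2Seq model: the encoder compresses the source into a context
# vector, and the decoder is fed ground-truth tokens (teacher forcing).
import torch
import torch.nn as nn

vocab_size, embed_dim, hidden_dim = 30, 16, 32

class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.gru = nn.GRU(embed_dim, hidden_dim, batch_first=True)
    def forward(self, src):
        _, hidden = self.gru(self.embed(src))
        return hidden                           # fixed-size context vector

class Decoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.gru = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)
    def forward(self, token, hidden):
        output, hidden = self.gru(self.embed(token), hidden)
        return self.out(output), hidden

encoder, decoder = Encoder(), Decoder()
src = torch.randint(0, vocab_size, (4, 7))      # toy source batch
tgt = torch.randint(0, vocab_size, (4, 5))      # toy target batch (column 0 acts as <sos>)

hidden = encoder(src)
loss_fn = nn.CrossEntropyLoss()
loss = 0.0
for t in range(tgt.size(1) - 1):
    # Teacher forcing: feed the ground-truth token at step t, predict step t+1.
    logits, hidden = decoder(tgt[:, t:t + 1], hidden)
    loss = loss + loss_fn(logits.squeeze(1), tgt[:, t + 1])
print(loss / (tgt.size(1) - 1))
# loss.backward() and an optimizer step would follow in a real training loop.
```

Note that the encoder's final hidden state is the only information the decoder receives; this is the context-vector bottleneck the attention lessons address.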
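For the attention and Transformer lessons: scaled dot-product attention is the shared building block, computed here directly; masking and multi-head splitting are omitted for brevity, and all shapes are arbitrary.

```python
# Scaled dot-product attention: weight each value by the softmax-normalized
# similarity between its key and the query.
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v):
    d_k = q.size(-1)
    # Similarity of each query to every key, scaled to stabilize the softmax.
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5        # (batch, q_len, k_len)
    weights = F.softmax(scores, dim=-1)                   # attention distribution
    return weights @ v, weights                           # weighted sum of values

batch, q_len, k_len, d_model = 2, 3, 5, 8
q = torch.randn(batch, q_len, d_model)
k = torch.randn(batch, k_len, d_model)
v = torch.randn(batch, k_len, d_model)
context, weights = scaled_dot_product_attention(q, k, v)
print(context.shape, weights.shape)   # (2, 3, 8) and (2, 3, 5)
# Self-attention is the special case where q, k, and v all come from the same sequence.
```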
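For "Exploring Transformers with HuggingFace": a short inspection of BERT's subword tokens, contextual embeddings, and attention weights with the transformers library. The model downloads on first use, and the example sentence is arbitrary.

```python
# Tokenize a sentence with BERT, then inspect its contextual embeddings and
# per-layer attention weights.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_attentions=True)

inputs = tokenizer("The bank raised interest rates.", return_tensors="pt")
print(tokenizer.convert_ids_to_tokens(inputs["input_ids"][0]))   # subword tokens

with torch.no_grad():
    outputs = model(**inputs)

# One contextual vector per token from the final layer.
print(outputs.last_hidden_state.shape)               # (1, num_tokens, 768)
# Attention: one tensor per layer, shaped (1, num_heads, num_tokens, num_tokens).
print(len(outputs.attentions), outputs.attentions[0].shape)
```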
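For "Implementing Evaluation Metrics with Hugging Face": Exact Match and token-overlap F1 written from scratch so the definitions are explicit. The normalization here is simplified (lowercasing and punctuation stripping only); the lesson itself works with Hugging Face tooling.

```python
# SQuAD-style Exact Match and token-overlap F1 with simplified normalization.
import re
from collections import Counter

def normalize(text):
    text = text.lower()
    text = re.sub(r"[^\w\s]", "", text)    # drop punctuation
    return " ".join(text.split())

def exact_match(prediction, reference):
    return float(normalize(prediction) == normalize(reference))

def f1_score(prediction, reference):
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    common = Counter(pred_tokens) & Counter(ref_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

print(exact_match("the Eiffel Tower", "The Eiffel Tower."))          # 1.0
print(f1_score("the Eiffel Tower in Paris", "The Eiffel Tower."))    # partial credit
```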
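For the final project: a toy version of the retrieval step, encoding documents and a query with a pretrained transformer sentence encoder and returning the top-k matches by cosine similarity. The use of the sentence-transformers library, the model name, and the corpus are placeholder assumptions, not part of the project brief.

```python
# Encode a tiny corpus and a query, then rank documents by cosine similarity.
import torch
import torch.nn.functional as F
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")   # small pretrained encoder (assumption)
documents = [
    "Employees accrue 20 vacation days per year.",
    "The VPN requires two-factor authentication.",
    "Expense reports are due by the fifth of each month.",
]
doc_emb = F.normalize(torch.tensor(model.encode(documents)), dim=-1)

query = "How many days of paid leave do I get?"
query_emb = F.normalize(torch.tensor(model.encode([query])), dim=-1)

scores = query_emb @ doc_emb.T                    # cosine similarities
top = torch.topk(scores, k=2, dim=-1)
for score, idx in zip(top.values[0], top.indices[0]):
    print(f"{score.item():.3f}  {documents[int(idx)]}")
```

The returned passages are what the downstream LLM would receive as context when generating its answer.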
Taught by
Sohbet Dovranov