Learn how to extend the context length of Large Language Models (LLMs) at inference time in this technical deep-dive video, which introduces grouped self-attention as an alternative to the classical transformer self-attention mechanism. Explore the out-of-distribution problems that positional encoding causes when an LLM processes text sequences longer than its pre-training context window. Examine implementation details, smooth transition techniques, and benchmark data while following code demonstrations based on the research paper "LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning." Master practical ways to handle longer sequences without retraining or fine-tuning the model.
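
To make the grouped self-attention idea concrete, here is a minimal sketch of how relative positions might be remapped so they stay inside the range seen during pre-training. It is an illustration under stated assumptions, not the video's or the paper's reference code: the function name and the parameters group_size and neighbor_window are hypothetical, and the shift term follows our reading of the paper's description of merging neighbor attention with grouped attention.

```python
def self_extend_relative_positions(seq_len, group_size, neighbor_window):
    """Return a causal table rel[q][k] of relative positions (k <= q).

    Assumption: tokens closer than `neighbor_window` keep their exact
    distance (normal attention); more distant tokens use floor-divided
    ("grouped") positions, shifted so the two regions join without a gap.
    """
    # Shift that lines up the first grouped distance with the window edge.
    shift = neighbor_window - neighbor_window // group_size
    rel = []
    for q in range(seq_len):
        row = []
        for k in range(q + 1):
            distance = q - k
            if distance < neighbor_window:
                # Nearby token: keep the exact relative position.
                row.append(distance)
            else:
                # Distant token: many distances share one grouped position.
                row.append(q // group_size - k // group_size + shift)
        rel.append(row)
    return rel


# Hypothetical settings chosen only for illustration.
table = self_extend_relative_positions(seq_len=16, group_size=4, neighbor_window=4)
largest = max(max(row) for row in table)
print("largest relative position:", largest)
```

With these example settings, a 16-token sequence never needs a relative position larger than 6, whereas unmodified attention would need 15. That compression is what keeps positions within the pre-trained range without any fine-tuning.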

Syllabus

Introduction
Theory
Main idea
Implementation
SelfExtend LLM
Deep Dive
Smooth Transition
Benchmark Data
Publication
Code Implementation

Taught by

Discover AI

Reviews

Start your review of Self-Extending LLM Context Windows Using Grouped Self-Attention
