Overview
Learn how to handle long sequences in natural language processing through a graduate-level lecture covering transformer model complexity, length extrapolation, and alternative architectures. Explore the computational and memory requirements of transformer models, understand how models behave when processing sequences longer than those seen during training, and survey non-attentional model designs. The lecture also examines methods for evaluating long-context models and transformer variants optimized for extended sequences. Part of Carnegie Mellon University's Advanced NLP course taught by Graham Neubig, it provides essential background for working with long text sequences in modern NLP applications.
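As a rough illustration of the quadratic scaling the lecture discusses, the sketch below estimates the memory needed for the attention score matrices in one layer of a standard transformer. The head count and fp16 precision are illustrative assumptions, not details from the lecture.

```python
# Illustrative sketch (not from the lecture): in standard scaled dot-product
# attention, each head materializes a (seq_len x seq_len) score matrix, so
# memory grows quadratically with sequence length.

def attention_score_memory_gb(seq_len: int,
                              num_heads: int = 16,       # assumed head count
                              bytes_per_value: int = 2   # fp16
                              ) -> float:
    """Approximate memory for one layer's attention score matrices."""
    return seq_len * seq_len * num_heads * bytes_per_value / 1e9

for n in (1_024, 8_192, 65_536):
    print(f"seq_len={n:>6}: ~{attention_score_memory_gb(n):.2f} GB per layer")
# seq_len=  1024: ~0.03 GB per layer
# seq_len=  8192: ~2.15 GB per layer
# seq_len= 65536: ~137.44 GB per layer
```

The 64x jump in sequence length produces a roughly 4,000x jump in memory, which is why the lecture turns to alternative architectures for long sequences.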
Syllabus
CMU Advanced NLP Fall 2024 (13): Long Sequence Models
Taught by
Graham Neubig