Nugget: Neural Agglomerative Embeddings of Text
Center for Language & Speech Processing (CLSP), JHU via YouTube
Overview
Explore a novel approach to text embedding called Nugget in this 37-minute conference talk by Guanghui Qin from the Center for Language & Speech Processing at Johns Hopkins University. Learn how Nugget addresses the limitations of constant-size representations by dynamically encoding language into meaningful units based on a subset of the input tokens. Discover how this method outperforms existing approaches in semantic comparison tasks and offers potential for expanding the context window of language models. Gain insights into how Nugget is trained through tasks such as autoencoding and machine translation, and understand its implications for future language models that can process significantly larger amounts of content.
Syllabus
Nugget: Neural Agglomerative Embeddings of Text - Guanghui Qin
Taught by
Center for Language & Speech Processing (CLSP), JHU