Efficient Streaming Language Models with Attention Sinks - Paper Explained

Yannic Kilcher via YouTube

Classroom Contents

  1. Introduction
  2. What is the problem?
  3. The hypothesis: Attention Sinks
  4. Experimental evidence
  5. Streaming LLMs
  6. Semantics or position?
  7. Can attention sinks be learned?
  8. More experiments
  9. Comparison to Big Bird
