MiniMax-01 Theory Overview - Lightning Attention + MoE + FlashAttention Optimization

Yacine Mahdid via YouTube

  1. Introduction: 0:00
  2. Model Overview: 3:04
  3. Main Result Overview: 8:14
  4. Background Information on Linear Attention: 11:00
  5. Lightning Attention Overview: 16:07
  6. I/O Optimization: 22:20
  7. Pre-training Recipe: 25:10
  8. Post-training Recipe: 26:31
  9. Full Results: 30:42
  10. Vision Modality for MiniMax-VL-01: 37:24
  11. Demo of MiniMax-Text-01: 41:20
  12. Final Words: 45:04
