
Mixtral of Experts - Paper Explained

Yannic Kilcher via YouTube

Overview

Explore an in-depth analysis of the Mixtral of Experts paper in this video lecture. It covers Sparse Mixture of Experts (SMoE) language models, compares Mixtral 8x7B's architecture with Mistral 7B, and examines its performance against Llama 2 70B and GPT-3.5. Learn about expert routing, sparse expert routing, and expert parallelism, then review the experimental results, routing analysis, and conclusions drawn from this research in natural language processing and artificial intelligence.
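To make the routing idea concrete, here is a minimal sketch of a sparse Mixture-of-Experts feed-forward layer with top-2 routing, in the spirit of Mixtral 8x7B. The class name, layer sizes, activation, and the dense per-expert loop are illustrative assumptions for readability, not the paper's or the lecture's actual code.

```python
# Minimal sketch: sparse MoE feed-forward layer with top-2 expert routing.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SparseMoE(nn.Module):
    def __init__(self, d_model=512, d_ff=2048, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        # Router: one logit per expert for every token.
        self.router = nn.Linear(d_model, n_experts, bias=False)
        # Each expert is an independent feed-forward network.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                        # x: (tokens, d_model)
        logits = self.router(x)                  # (tokens, n_experts)
        # Keep only the top-k experts per token; renormalize their weights.
        weights, idx = logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)     # (tokens, top_k)
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e            # tokens routed to expert e in slot k
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * expert(x[mask])
        return out


# Usage: route 4 token embeddings through the sparse layer.
moe = SparseMoE()
tokens = torch.randn(4, 512)
print(moe(tokens).shape)                         # torch.Size([4, 512])
```

Because only 2 of the 8 experts run per token, the layer uses a fraction of its total parameters for each forward pass; expert parallelism then places different experts on different devices so tokens are exchanged rather than weights.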

Syllabus

- Introduction
- Mixture of Experts
- Classic Transformer Blocks
- Expert Routing
- Sparse Expert Routing
- Expert Parallelism
- Experimental Results
- Routing Analysis
- Conclusion

Taught by

Yannic Kilcher
