Deepseek V3 Architecture and Performance Optimization - From Training to Deployment

Trelis Research via YouTube

Overview

Explore a technical lecture on Deepseek V3 that covers its performance improvements and its approach to compute efficiency. Learn how it compares with Claude Sonnet and GPT-4o, including speed tests, and what its model size means for self-hosted deployment. Understand the implications of GPU types and export restrictions, then dive into training efficiency improvements and the evolution of model architecture from 2022 to 2024. Get to grips with the Mixture of Experts concept and its load balancing challenges, along with Deepseek's auxiliary-loss-free solution. Discover three further optimization techniques: FP8 (8-bit) training, Multi-head Latent Attention (MLA), which compresses attention, and multi-token prediction, together with the benefits of speculative decoding. All topics are presented with practical examples and technical depth.

Syllabus

- Deepseek V3 performance
- Performance comparison with Claude Sonnet and GPT-4o
- Speed tests vs Sonnet and GPT-4o
- Discussion of model size and deployment requirements for self-hosting
- Analysis of GPU types and export restrictions
- Explanation of training efficiency improvements
- Overview of model architecture evolution over 2022-2024
- Introduction of Mixture of Experts concept
- Discussion of load balancing problems
- Explanation of Deepseek's load balancing solution: the auxiliary-loss-free approach
- Introduction of three additional Deepseek optimisation techniques: FP8 training, MLA, and multi-token prediction
- Discussion of 8-bit training
- Explanation of compressed attention: Multi-head Latent Attention (MLA)
- Details of multi-token prediction
- Benefits of speculative decoding
- Conclusion and summary of Deepseek improvements

Taught by

Trelis Research
