Code DeepSeek V3 From Scratch in Python - Full Course

via freeCodeCamp

Overview

Learn to implement DeepSeek V3 from scratch in this 3-hour 47-minute Python course. Instructor @vukrosic pairs theoretical explanations with hands-on coding for each part of the model. Topics include attention mechanisms, Query-Key-Value operations, the KV cache, Multi-Head Latent Attention (MLA), Rotary Position Embedding (RoPE), Mixture of Experts (MoE), gating mechanisms, and transformer blocks. The course references the DeepSeek V3 paper and provides access to inference code that can be modified for training. The curriculum starts with fundamental attention concepts and builds toward a complete transformer architecture, with a coding session for each component.
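For readers who want a feel for the first topics before watching, here is a minimal sketch of scaled dot-product attention with a KV cache, written in plain Python. It is not taken from the course or from the DeepSeek V3 code; the names `attend` and `KVCache` are hypothetical, and it handles a single head with no learned projections.

```python
import math

def softmax(xs):
    # Subtract the max for numerical stability before exponentiating.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Scaled dot-product attention for one query vector."""
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    # Weighted sum of the value vectors.
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]

class KVCache:
    """Hypothetical helper: keeps keys and values of tokens already seen."""

    def __init__(self):
        self.keys = []
        self.values = []

    def step(self, query, key, value):
        # Append only the new token's key and value, then attend over
        # everything cached so far (causal by construction).
        self.keys.append(key)
        self.values.append(value)
        return attend(query, self.keys, self.values)

cache = KVCache()
out1 = cache.step([1.0, 0.0], [1.0, 0.0], [1.0, 2.0])
out2 = cache.step([0.0, 1.0], [0.0, 1.0], [3.0, 4.0])
```

The point of the cache is that each decoding step reuses earlier keys and values instead of recomputing them; only the newest token's key and value are added.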

Syllabus

⌨️ 0:00:00 Intro
⌨️ 0:01:40 Attention Mechanism
⌨️ 0:13:34 Query, Key, Value
⌨️ 0:34:11 KV Cache
⌨️ 0:39:06 Multi-Head Latent Attention (MLA)
⌨️ 0:58:53 Coding MLA
⌨️ 1:28:41 RoPE
⌨️ 1:55:44 Coding KV Cache
⌨️ 2:00:25 MLA forward
⌨️ 2:28:24 MoE, Gate
⌨️ 2:49:25 Gate code
⌨️ 3:09:10 MoE code
⌨️ 3:28:36 Transformer Blocks
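
As a rough illustration of the "MoE, Gate" portion of the syllabus, the sketch below routes one token to its top-k experts using generic softmax gating. This is an assumption-laden toy, not the course's code: the actual DeepSeek V3 gate differs in its details and is not reproduced here, and `top_k_gate`, `moe_forward`, the toy experts, and the gate weights are all hypothetical.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_gate(token, gate_weights, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    # One logit per expert: dot product of the token with that expert's gate row.
    logits = [sum(t * w for t, w in zip(token, row)) for row in gate_weights]
    probs = softmax(logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    # Renormalize so the selected experts' weights sum to 1.
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

def moe_forward(token, gate_weights, experts, k=2):
    """Combine the outputs of the selected experts, weighted by the gate."""
    out = [0.0] * len(token)
    for idx, weight in top_k_gate(token, gate_weights, k):
        expert_out = experts[idx](token)
        out = [o + weight * e for o, e in zip(out, expert_out)]
    return out

# Toy experts standing in for feed-forward networks.
experts = [
    lambda x: [2.0 * v for v in x],
    lambda x: [v + 1.0 for v in x],
    lambda x: [-v for v in x],
]
gate_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]

routes = top_k_gate([2.0, 1.0], gate_weights, k=2)
output = moe_forward([2.0, 1.0], gate_weights, experts, k=2)
```

Only the selected experts run for a given token, which is what lets an MoE layer hold many parameters while keeping per-token compute small.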

Taught by

freeCodeCamp.org
