Train Your Own LLM - Tutorial

via freeCodeCamp

Overview

Learn how to train a language model from scratch in this 3.5-hour tutorial taught by Imad Saddik. Using Moroccan Darija as the example language, the course walks through the full pipeline: loading text data, training a tokenizer with Byte Pair Encoding, understanding the Transformer architecture, pre-training a model, creating a supervised fine-tuning dataset, and building your own AI assistant. All resources, including code, notebooks, datasets, and tokenizers, are available through the provided GitHub repositories and Hugging Face links. The sections progress from basic concepts to scaling techniques, so beginners can follow along while gaining hands-on implementation experience.
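
For readers unfamiliar with Byte Pair Encoding, the tokenizer-training step named above can be illustrated with a minimal sketch. This is not the course's code: the function names `train_bpe` and `apply_bpe` and the toy corpus are assumptions made for illustration, and a real tokenizer would typically work on bytes and a much larger corpus. The idea is to repeatedly find the most frequent adjacent symbol pair and fuse it into a new symbol.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Return a copy of `symbols` with every adjacent occurrence of `pair` fused."""
    merged = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return merged


def train_bpe(corpus, num_merges):
    """Learn up to `num_merges` merge rules from a list of words (hypothetical helper)."""
    # Start with each word split into single characters, counted by frequency.
    words = Counter(tuple(word) for word in corpus)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by how often each word occurs.
        pairs = Counter()
        for symbols, freq in words.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break  # every word is already a single symbol
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Rewrite the vocabulary with the chosen pair fused.
        new_words = Counter()
        for symbols, freq in words.items():
            new_words[tuple(merge_pair(list(symbols), best))] += freq
        words = new_words
    return merges


def apply_bpe(word, merges):
    """Tokenize one word by replaying the learned merges in order."""
    symbols = list(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return symbols


# Toy corpus, chosen only to make the merge order easy to follow.
toy_corpus = ["abab", "abab", "abc"]
toy_merges = train_bpe(toy_corpus, num_merges=2)
print(toy_merges)
print(apply_bpe("abc", toy_merges))
```

On this toy corpus the pair `("a", "b")` is the most frequent, so it is merged first; the second merge then fuses two `"ab"` symbols. Replaying the merges in the order they were learned is what lets the same rules tokenize unseen text.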

Syllabus

0:00:00 About the Course
0:03:03 Introduction
0:07:24 Training Data
0:15:33 Tokenization
0:29:00 The Transformer Architecture
0:52:21 Pre-training
1:24:46 Fine-tuning Dataset
1:33:05 Instruction Fine-tuning
2:06:17 Fine-tuning with LoRA
2:20:39 Let's Scale Everything
3:09:40 Bonus
3:27:10 Conclusion

Taught by

freeCodeCamp.org
