Deep Dive into Large Language Models: From ChatGPT to Training and Applications

Andrej Karpathy via YouTube

Classroom Contents

  1. 00:00:00 Introduction
  2. 00:01:00 Pretraining data (internet)
  3. 00:07:47 Tokenization
  4. 00:14:27 Neural network I/O
  5. 00:20:11 Neural network internals
  6. 00:26:01 Inference
  7. 00:31:09 GPT-2: training and inference
  8. 00:42:52 Llama 3.1 base model inference
  9. 00:59:23 Pretraining to post-training
  10. 01:01:06 Post-training data (conversations)
  11. 01:20:32 Hallucinations, tool use, knowledge/working memory
  12. 01:41:46 Knowledge of self
  13. 01:46:56 Models need tokens to think
  14. 02:01:11 Tokenization revisited: models struggle with spelling
  15. 02:04:53 Jagged intelligence
  16. 02:07:28 Supervised finetuning to reinforcement learning
  17. 02:14:42 Reinforcement learning
  18. 02:27:47 DeepSeek-R1
  19. 02:42:07 AlphaGo
  20. 02:48:26 Reinforcement learning from human feedback (RLHF)
  21. 03:09:39 Preview of things to come
  22. 03:15:15 Keeping track of LLMs
  23. 03:18:34 Where to find LLMs
  24. 03:21:46 Grand summary
