Overview
This 10-minute video explains the research paper on Large Language Diffusion Models (LLaDA) and explores how diffusion models are emerging as alternatives to traditional autoregressive approaches for language tasks. Discover the fundamental difference between the two: autoregressive models generate text sequentially, one token after another, while diffusion models process language "all-in-one-go". Learn about the pre-training process, the supervised fine-tuning techniques, and the inference methods used in LLaDA. Examine the experimental results, which evaluate whether diffusion models could represent the future of large language models given their computational advantages. The video breaks these concepts into short segments covering motivation, technical approach, and performance comparisons.
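The contrast between sequential and "all-in-one-go" generation can be illustrated with a toy sketch. This is not LLaDA's implementation: `toy_predict` is a hypothetical stand-in for a trained model, `TARGET` is a made-up sentence, and the confidence-based unmasking schedule is an assumption chosen only to show the general shape of a masked-diffusion style decoder next to a left-to-right one.

```python
MASK = "<mask>"
# Made-up sentence that the hypothetical "model" always reproduces.
TARGET = ["diffusion", "models", "fill", "in", "all", "positions"]


def toy_predict(tokens, position):
    """Hypothetical stand-in for a model: returns (token, confidence)."""
    confidence = 1.0 / (1 + position)  # arbitrary, deterministic score
    return TARGET[position], confidence


def autoregressive_decode(length):
    """Left-to-right decoding: one model step per generated token."""
    tokens, steps = [], 0
    for position in range(length):
        token, _ = toy_predict(tokens, position)
        tokens.append(token)
        steps += 1
    return tokens, steps


def diffusion_decode(length, num_steps):
    """Masked-diffusion style decoding (toy version).

    Starts from a fully masked sequence. Each step predicts every masked
    position at once (one forward pass in a real model) and keeps only
    the most confident predictions.
    """
    tokens, steps = [MASK] * length, 0
    for step in range(num_steps):
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        if not masked:
            break
        predictions = {i: toy_predict(tokens, i) for i in masked}
        # Spread the remaining masked positions over the remaining steps.
        keep = -(-len(masked) // (num_steps - step))  # ceiling division
        best = sorted(masked, key=lambda i: predictions[i][1], reverse=True)
        for i in best[:keep]:
            tokens[i] = predictions[i][0]
        steps += 1
    return tokens, steps


if __name__ == "__main__":
    print(autoregressive_decode(len(TARGET)))
    print(diffusion_decode(len(TARGET), num_steps=3))
```

In this toy setup the autoregressive loop takes one step per token, while the diffusion-style loop fills the same six positions in a fixed number of steps that does not depend on the sequence length. Whether that translates into real-world speedups depends on the model and the number of steps, which is the question the video's results section addresses.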
Syllabus
0:00 - Intro
1:23 - Motivation
1:51 - Autoregressive VS Diffusion
4:17 - Pre-training
4:52 - Supervised Fine-tuning
5:24 - Inference
6:51 - Experiments and Results
Taught by
AI Bites