How AI Videos Actually Work - Diffusion Models, CLIP, and the Math of Turning Text into Images

3Blue1Brown via YouTube

Per Chenyang Yuan: at 10:15, the blurry image that results when removing random noise in DDPM is probably due to a mismatch in noise levels when calling the denoiser. When the denoiser is called on x…

YouTube videos curated by Class Central.

Classroom Contents

  1. 0:00 - Intro
  2. 3:37 - CLIP
  3. 6:25 - Shared Embedding Space
  4. 8:16 - Diffusion Models & DDPM
  5. 11:44 - Learning Vector Fields
  6. 22:00 - DDIM
  7. 25:25 - DALL·E 2
  8. 26:37 - Conditioning
  9. 30:02 - Guidance
  10. 33:39 - Negative Prompts
  11. 34:27 - Outro
  12. 35:32 - About guest videos + Grant’s Reaction
  13. 6:15 CLIP: Although directly minimizing cosine similarity would push our vectors 180 degrees apart on a single batch, overall in practice, we need CLIP to maximize the uniformity of concepts over the…
  14. Per Chenyang Yuan: at 10:15, the blurry image that results when removing random noise in DDPM is probably due to a mismatch in noise levels when calling the denoiser. When the denoiser is called on x…
  15. For the vectors at 31:40: some implementations use f(x, t, cat) + α(f(x, t, cat) − f(x, t)), and some use f(x, t) + α(f(x, t, cat) − f(x, t)), where an α value of 1 corresponds to no guidance. I cho…
  16. At 30:30, the unconditional t=1 vector field looks a bit different from what it did at the 17:15 mark. This is the result of different models trained for different parts of the video, and likely a re…
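The two guidance conventions mentioned in the note on the vectors at 31:40 can be compared with a small sketch. It assumes the flattened formulas read as f(x, t, cat) + α(f(x, t, cat) − f(x, t)) and f(x, t) + α(f(x, t, cat) − f(x, t)). The function names and the toy vectors below are hypothetical stand-ins for a model's conditional and unconditional outputs; they are not taken from the video's code.

```python
import numpy as np

def guide_from_conditional(cond, uncond, alpha):
    # Assumed first form: f(x,t,cat) + alpha * (f(x,t,cat) - f(x,t)).
    # Under this reading, alpha = 0 returns the plain conditional output.
    return cond + alpha * (cond - uncond)

def guide_from_unconditional(cond, uncond, alpha):
    # Assumed second form: f(x,t) + alpha * (f(x,t,cat) - f(x,t)).
    # Here alpha = 1 returns the plain conditional output (no guidance).
    return uncond + alpha * (cond - uncond)

# Toy stand-ins for the model outputs f(x, t, cat) and f(x, t).
cond = np.array([1.0, 2.0])
uncond = np.array([0.5, 0.0])

no_guidance = guide_from_unconditional(cond, uncond, 1.0)
boosted_a = guide_from_conditional(cond, uncond, 2.0)
boosted_b = guide_from_unconditional(cond, uncond, 3.0)
```

Under these assumptions the two forms differ only by an offset of 1 in α, so `boosted_a` and `boosted_b` are the same vector.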
