YouTube videos curated by Class Central.
Classroom Contents
Qwen vs FLUX Guide - Architecture, VAE Quality, Speed, and Use Cases
- 1 0:00 Intro: two open image models and what we’ll compare
- 2 0:56 Qwen-Image overview: 20B MMDiT, native text, dual-encoding
- 3 1:40 FLUX.1 Kontext overview: rectified flow, sequence concat, 3D RoPE, LADD
- 4 2:25 FLUX text stack: CLIP ViT-L/14 + T5-XXL, token limits
- 5 3:04 Why CLIP needs T5: 77-token ceiling vs 256/512 prompts
- 6 3:57 Qwen text stack: Qwen2.5-VL front end, 512-token prompts, VLM frozen for edits
- 7 4:27 Bottom line on prompts & bilingual text: why Qwen excels for documents
- 8 5:03 VAE 101: latent denoising and decoding back to pixels
- 9 5:40 Why VAE quality matters: crisp glyphs, micro-detail, layout preservation
- 10 6:23 Takeaway: Qwen for tiny fonts; Kontext for fast multi-turn identity
- 11 6:55 First impressions: from ControlNet to Kontext & Qwen
- 12 7:54 Editing approaches: Qwen dual-path semantics + appearance vs Kontext unified
- 13 9:04 Who wins where: text fidelity vs character consistency & speed
- 14 9:15 Training notes: coarse→fine text curriculum, multi-pass idea
- 15 10:46 Practical picks: when to choose Qwen vs Kontext
- 16 11:23 Case study: library scene — detail & fidelity comparisons
- 17 12:36 Inpainting test: Pikachu on shoulder — preservation vs saturation
- 18 13:57 Kontext vs Qwen: subject integrity and color differences
- 19 15:29 3D model rotation test: textures, fur, and rock detail
- 20 17:07 Multi-model image comparisons: Gemini, ImageFX, OpenAI, FLUX
- 21 18:30 Water, reflections, and “CG look” — who feels more natural
- 22 21:14 Portrait test: street blur, photoreal modes, dripping artifact
- 23 22:30 Character consistency across poses — limits & prompt issues
- 24 23:01 Final verdict: pick the right tool; links & subscribe