Learn about self-speculative masked diffusions, a novel approach to masked diffusion generative models that significantly reduces the computational cost of discrete data generation. This 51-minute conference talk by Andrew Campbell from Valence Labs explores why standard masked diffusion models require many simulation steps: their factorization approximation limits the number of positions that can be sampled simultaneously without degrading quality.

Discover how the presented method addresses this limitation by generating non-factorized predictions over masked positions, using a modified transformer attention mechanism that switches from non-causal to causal masking. Understand the model-integrated speculative sampling mechanism that drafts tokens and validates them in parallel, producing non-factorized predictive distributions in a single forward pass.

Examine practical applications on GPT-2-scale text modeling and protein sequence generation, where the approach achieves roughly a 2x reduction in required network forward passes compared to standard masked diffusion models, making discrete data generation significantly more efficient while maintaining sample quality.
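The draft-and-validate step described above follows the standard speculative-sampling acceptance rule: each draft token is accepted with probability min(1, p/q), where q is the draft (proposal) probability and p the target model's probability, and the first rejection triggers a resample from the residual distribution. The sketch below is illustrative only, not the talk's implementation; the function name, the toy per-position distributions, and the residual-resampling details are assumptions.

```python
import random

def speculative_accept(draft_tokens, q_probs, p_probs, rng=None):
    """Validate a sequence of draft tokens against a target model.

    draft_tokens: tokens proposed by the (cheap) draft distribution.
    q_probs / p_probs: per-position dicts mapping token -> probability
    under the draft and target distributions, respectively.
    Returns the accepted prefix, with a corrective sample appended at
    the first rejection (sampled from max(p - q, 0), renormalized).
    """
    rng = rng or random.Random(0)
    accepted = []
    for tok, q, p in zip(draft_tokens, q_probs, p_probs):
        # Accept the draft token with probability min(1, p/q).
        if rng.random() < min(1.0, p[tok] / q[tok]):
            accepted.append(tok)
            continue
        # Rejected: resample from the residual distribution.
        residual = {t: max(p[t] - q[t], 0.0) for t in p}
        r = rng.random() * sum(residual.values())
        for t, w in residual.items():
            r -= w
            if r < 0:
                accepted.append(t)
                break
        else:  # guard against floating-point underflow
            accepted.append(max(residual, key=residual.get))
        break
    return accepted
```

Because validation of all draft positions can be batched into one forward pass of the target model, every accepted draft token saves a full network evaluation, which is the source of the roughly 2x reduction in forward passes reported in the talk.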