Mixture-of-Experts (MoE) - Spectral Decomposition in Orthogonal Subspaces

Explore the next generation of Mixture-of-Experts (MoE) models in this 18-minute video, which examines why current MoE architectures underperform classical proprietary large language models and vision-language models. Discover the research on SD-MoE (Spectral Decomposition for Effective Expert Specialization), developed by researchers from Fudan University, Tsinghua University, the University of Michigan, Carnegie Mellon University, and other institutions. Learn how its spectral decomposition approach enables more effective expert specialization in orthogonal subspaces, addressing fundamental limitations of existing MoE architectures. Gain insight into the mathematical foundations and practical implications of this methodology for how mixture-of-experts models are designed and implemented. The presentation also covers research on "Consistency of Large Reasoning Models Under Multi-Turn Attack" from Carnegie Mellon University, providing additional context on current challenges in the development and robustness of large-scale AI models.