Overview
This seminar examines critical vulnerabilities in modern AI systems, focusing on two areas of concern in retrieval and language models.

First, learn how dense retrievers suffer from systematic collapse under heuristic biases: they prefer shorter, earlier, and more literal matches over factual evidence, leading to significant failures in Retrieval-Augmented Generation (RAG) systems.

Second, discover SteerMoE, a framework for controlling Mixture-of-Experts (MoE) Large Language Models by selectively activating or deactivating behavior-linked experts. This control can improve safety and faithfulness, while also exposing the risks of unsafe steering.

The talk offers insight into both the weaknesses of current systems and the opportunities for building more robust, controllable AI. It is presented by Mohsen Fayyaz, a Ph.D. student at UCLA specializing in Natural Language Processing, with a focus on interpretability, explainability, and robustness. His research spans token attribution, metaphors in pre-trained models, probing techniques, dataset pruning, rationale explanations, dense retriever robustness, and steering methods for Mixture-of-Experts LLMs.
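To give a feel for the expert-steering idea described above, here is a minimal toy sketch of top-k MoE routing with a mask that deactivates chosen experts. This is an illustrative assumption about the general mechanism (masking experts out of the router before top-k selection and renormalizing the gate weights), not SteerMoE's actual implementation; all function names, expert indices, and logits are hypothetical.

```python
import math

def top_k_route(logits, k=2, deactivated=frozenset()):
    """Toy MoE router: choose the top-k experts by gate logit,
    after excluding a steered-off (deactivated) set of experts.
    Returns {expert_index: gate_weight}, weights summing to 1."""
    # Mask out deactivated experts before the top-k selection.
    candidates = [(l, i) for i, l in enumerate(logits) if i not in deactivated]
    chosen = sorted(candidates, reverse=True)[:k]
    # Softmax over the surviving top-k gate logits (shifted for stability).
    z = max(l for l, _ in chosen)
    exps = [(math.exp(l - z), i) for l, i in chosen]
    total = sum(e for e, _ in exps)
    return {i: e / total for e, i in exps}

# Hypothetical gate logits for four experts.
weights = top_k_route([2.0, 0.5, 1.5, -1.0], k=2)
# Normal routing sends the token to experts 0 and 2.
steered = top_k_route([2.0, 0.5, 1.5, -1.0], k=2, deactivated={0})
# With expert 0 deactivated, routing shifts to experts 2 and 1.
```

Selectively removing an expert from routing in this way is what lets a steering method suppress (or, by masking the complement, amplify) whatever behavior that expert is linked to.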
Syllabus
Collapse of Dense Retrievers & Steering MoE LLMs
Taught by
USC Information Sciences Institute