Overview
Explore a novel approach to transformer architecture that challenges traditional Retrieval-Augmented Generation (RAG) methods in this 16-minute conference talk. Learn about Extended Mind Transformers (EMT), an innovative transformer variant that leverages the model's inherent key/query mechanism to dynamically select and attend to relevant information during each generation step, rather than relying on embedding-based document retrieval. Discover how this architecture achieves state-of-the-art performance in long-context applications while addressing fundamental limitations of RAG systems. Examine the key design decisions behind the EMT implementation, including extended mind attention mechanisms, evaluation methodologies, and strategies for reducing hallucinations. Gain insights into the mathematical foundations and practical applications of this approach, with access to implementation resources through GitHub repositories and Hugging Face model collections for hands-on experimentation.
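The core idea described above, each query vector retrieving its own most-relevant external memories via key similarity and attending over them alongside the local context, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation; the function name, the `top_k` parameter, and the random memory arrays are all illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def extended_mind_attention(q, local_k, local_v, mem_k, mem_v, top_k=2):
    """Sketch of extended mind attention (illustrative, not the EMT code).

    For each query row, retrieve the top_k external memory key/value pairs
    by key similarity, then attend jointly over local context and the
    retrieved memories, instead of retrieving whole documents up front.
    """
    d = q.shape[-1]
    sims = q @ mem_k.T                           # (n_q, n_mem) query-key similarity
    idx = np.argsort(-sims, axis=-1)[:, :top_k]  # per-query top-k memory indices
    out = np.empty_like(q)
    for i, query in enumerate(q):
        k = np.vstack([local_k, mem_k[idx[i]]])  # local keys + retrieved memory keys
        v = np.vstack([local_v, mem_v[idx[i]]])
        w = softmax(query @ k.T / np.sqrt(d))    # scaled dot-product attention weights
        out[i] = w @ v
    return out
```

Note how retrieval happens inside the attention step, per query and per generation step, which is the key contrast with embedding-based RAG, where documents are selected once before generation begins.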
Syllabus
Introduction
Long Context vs RAG
Extended Mind Attention
Evaluations
Results
Citations
Reduce hallucinations
Parameters
Taught by
AI Engineer