Overview
Explore an innovative approach to enhancing Mixture of Experts (MoE) models by injecting Python functions as virtual experts in this 10-minute technical demonstration. Learn how to intercept GPT-OSS-20B's routing mechanism in early layers to classify arithmetic tasks and redirect them to a custom Python function that acts as Expert 769, seamlessly integrating with the model's existing 768 neural experts. Discover how this virtual expert successfully solves mathematical problems like 127 × 89 that the original neural experts fail to compute correctly.

Examine the technical implementation of math classification and routing, see live demonstrations of different expert types in action, and understand how combining virtual experts with expert pruning can reduce model size by half while improving mathematical performance. Compare this virtual expert approach with traditional tool calling methods and gain insights into the future potential of this new primitive for expanding MoE capabilities beyond purely neural architectures.
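The routing interception described above can be sketched in plain Python. This is an illustrative toy, not the actual GPT-OSS-20B implementation: the names (`classify_math`, `VIRTUAL_EXPERT_ID`, `route`) and the regex-based math classifier are assumptions standing in for the model's learned router, and the neural experts are stubbed out entirely.

```python
# Toy sketch of a "virtual expert": a router normally dispatches to one of
# 768 neural experts; here a lightweight classifier intercepts pure
# arithmetic prompts and redirects them to a Python function acting as
# Expert 769. All identifiers are illustrative assumptions.
import re

NUM_NEURAL_EXPERTS = 768   # neural expert pool, as described in the video
VIRTUAL_EXPERT_ID = 769    # one slot beyond the neural experts

MATH_RE = re.compile(r"^\s*(\d+)\s*([+\-*x×/])\s*(\d+)\s*$")

def classify_math(prompt: str):
    """Return (a, op, b) if the prompt is a simple arithmetic task, else None."""
    m = MATH_RE.match(prompt)
    if not m:
        return None
    return int(m.group(1)), m.group(2), int(m.group(3))

def virtual_math_expert(prompt: str) -> str:
    """Expert 769: exact arithmetic computed in Python, not a neural net."""
    a, op, b = classify_math(prompt)
    results = {"+": a + b, "-": a - b,
               "*": a * b, "x": a * b, "×": a * b,
               "/": a / b if b else float("nan")}
    return str(results[op])

def route(prompt: str) -> tuple[int, str]:
    """Intercepted router: arithmetic goes to the virtual expert;
    everything else falls through to a (stubbed) neural expert."""
    if classify_math(prompt) is not None:
        return VIRTUAL_EXPERT_ID, virtual_math_expert(prompt)
    neural_id = hash(prompt) % NUM_NEURAL_EXPERTS  # stand-in for learned routing
    return neural_id, "<neural expert output>"

print(route("127 x 89"))  # routed to Expert 769, which returns "11303"
```

The key design point the video makes is that, unlike tool calling, this happens inside the forward pass: from the rest of the model's perspective the virtual expert is just another expert selected by the router.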
Syllabus
- Expert Math Failure
- Virtual Math Expert Demo
- Understanding Virtual Experts
- Math Classification and Routing
- Math Demos
- Different Expert Types
- Halving GPT-OSS-20B
- Tool Calling vs Virtual Experts
- Future Thoughts
Taught by
Chris Hay