Learn how to accelerate mixture-of-experts (MoE) training with rail-optimized InfiniBand networking in this 18-minute conference talk from the AI Engineer World's Fair. Discover the networking challenges that arise when training state-of-the-art machine learning models with MoE techniques, which split model layers across multiple expert sub-networks so that larger models can be trained more efficiently. Explore Crusoe Cloud's high-performance InfiniBand network architecture, designed to handle the sparse distribution of model state that puts increasing pressure on cluster-level networking during training. Understand the "rail-optimized" design approach, which reduces the number of network hops between sets of GPUs in a cluster, accelerates all-to-all communication, and ultimately shortens training time. Gain insights into applying these networking solutions to your own training workloads to improve the efficiency of large-scale model training.