Turbocharging Foundation Model Training: Cutting-Edge Strategies for Faster Convergence
MLOps World: Machine Learning in Production via YouTube
Overview
This conference talk explores advanced techniques for reducing the training time of large-scale foundation models and accelerating their convergence without compromising quality. Presented by Rahul Raja, a Staff Software Engineer at LinkedIn specializing in search, machine learning infrastructure, and recommender systems, the 30-minute session covers strategies for optimizing resource-intensive training processes: dynamic batching, mixed precision training, curriculum learning, gradient accumulation, adaptive optimization, efficient data pipelines, and distributed training frameworks. Attendees gain actionable insights and best practices for improving model training efficiency, enabling faster development cycles and more scalable foundation models. Raja brings expertise from his work on vector search, AI-powered recommendations, and large-scale ML systems, as well as his research contributions in LLMs, NLP, and generative AI.
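To make one of the listed techniques concrete: gradient accumulation lets a model train with an effectively large batch size while only holding small micro-batches in memory, because averaging the gradients of equal-size micro-batches yields the same result as one full-batch gradient step. The sketch below (not from the talk; a minimal NumPy illustration using a hypothetical linear model with MSE loss) demonstrates that equivalence:

```python
import numpy as np

def mse_grad(w, X, y):
    # Gradient of 0.5 * mean((Xw - y)^2) with respect to w
    return X.T @ (X @ w - y) / len(y)

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 3))   # "large batch" of 64 examples
y = rng.normal(size=64)
w = np.zeros(3)

# Full-batch gradient computed in one pass
full_grad = mse_grad(w, X, y)

# Gradient accumulation: average gradients over 4 micro-batches of 16
accum = np.zeros(3)
for Xb, yb in zip(np.split(X, 4), np.split(y, 4)):
    accum += mse_grad(w, Xb, yb)
accum /= 4

print(np.allclose(full_grad, accum))  # → True
```

In a real training loop the same idea applies: run several forward/backward passes, summing gradients, and call the optimizer step only once per accumulation window; this keeps peak memory at the micro-batch size while preserving large-batch gradient statistics.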
Syllabus
Turbocharging Foundation Model Training: Cutting-Edge Strategies for Faster Convergence
Taught by
MLOps World: Machine Learning in Production