Scaling Down, Powering Up: Can Efficient Training Beat Scaling Laws?

In this conference talk from MLOps World: Machine Learning in Production, Malikeh Ehghaghi, Machine Learning Research Scientist at Vector Institute, challenges traditional beliefs about scaling laws in large language models (LLMs). Explore how strategies that prioritize efficiency and cost-effectiveness can outperform simply scaling up model parameters and data volume. Discover how DeepSeek's success serves as evidence that thoughtful data engineering and meticulous model design can achieve superior performance without prohibitive costs. The 31-minute presentation covers state-of-the-art data-centric approaches (data mixing, filtering, deduplication) and model-centric strategies (pruning, distillation, parameter-efficient fine-tuning, quantization, model merging) for training language models efficiently. Learn about the rise of small language models (SLMs) as cost-efficient alternatives to dense LLMs, and how careful preparation can produce superior results without the massive financial investment traditionally considered necessary to scale AI systems.