Fixing Bugs in Gemma, Llama, and Phi 3 - Bug Analysis and Model Optimization

Learn how to identify, analyze, and fix critical bugs in major open-source language models in this conference talk. It details the discovery and resolution of 8 bugs in Google's Gemma, multiple tokenization issues in Llama 3, sliding window problems, and the process of "Mistral-fying" Phi-3. Discover a systematic approach to bug detection in open-source models and explore techniques that make fine-tuning 2x faster across these model architectures. Gain insight into the technical challenges of working with large language models, including VRAM optimization strategies that cut memory use by 70% without accuracy loss, and into the methods experienced practitioners use to improve model performance and reliability in production environments.