Overview
Explore the evolution of small language models (SLMs) and discover why architectural innovation combined with high-quality data now outweighs raw parameter counts in this 50-minute AI discussion. Learn about hybrid transformer/convolutional designs and neural architecture search (NAS) techniques that enable efficient models to run locally on consumer GPUs and modern laptops without compromising performance.

Dive into pre-training and post-training methodologies, examining why data quality remains paramount and when different optimizers, such as Adam versus Muon, excel depending on the architecture. Learn practical optimization methods, including layer freezing with Spectrum, and weigh the trade-offs involved in LoRA implementations (see the sketches below).

Investigate advanced distillation approaches that compare token-probability matching against hidden-state matching, explore the realities of reinforcement learning fine-tuning (RLFT) for small language models, and discover the emerging landscape of on-device, agentic AI systems designed for private, offline assistance with email management, calendar coordination, research tasks, and beyond.
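To make the layer-freezing idea concrete, here is a minimal sketch in PyTorch, assuming a Hugging Face-style decoder whose blocks live under `model.model.layers` (an assumption, not Spectrum's actual API). Spectrum itself chooses which blocks to train via a signal-to-noise analysis of the weight matrices; that selection step is omitted here and the indices are passed in directly.

```python
import torch.nn as nn

def freeze_all_but(model: nn.Module, trainable_block_indices: list[int]) -> None:
    """Freeze every parameter, then re-enable gradients for selected blocks.

    Spectrum would pick `trainable_block_indices` from a signal-to-noise
    analysis of the weights; here the selection is supplied by the caller.
    The `model.model.layers` path is an assumption matching common
    Hugging Face decoder implementations.
    """
    for param in model.parameters():
        param.requires_grad = False
    for idx in trainable_block_indices:
        for param in model.model.layers[idx].parameters():
            param.requires_grad = True
```

Only the unfrozen blocks accumulate gradients and optimizer state, which is what makes this cheaper than full fine-tuning while still allowing full-rank updates in the trained layers.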
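The LoRA trade-offs can likewise be seen in a minimal sketch of a low-rank adapter wrapped around a frozen linear layer (plain textbook LoRA, not any particular library's implementation); the rank `r` and scaling `alpha` are the knobs being traded off.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base layer plus a trainable low-rank update: y = Wx + (alpha/r) * B(Ax)."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for param in self.base.parameters():
            param.requires_grad = False  # the pretrained weight stays fixed
        # A starts small-random and B starts at zero, so the wrapped layer
        # initially behaves exactly like the base model.
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale
```

A smaller `r` means fewer trainable parameters and less optimizer memory, but constrains the weight update to rank `r`; the adapter can be merged into the base weight after training, so there is no inference-time overhead, whereas Spectrum-style freezing keeps full-rank updates at the cost of training entire blocks.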
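Finally, the distillation comparison contrasts matching the teacher's token probabilities against matching its hidden states. Here is a minimal sketch of both objectives in PyTorch; the temperature value and the projection layer are illustrative assumptions, not details from the talk.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def token_probability_loss(student_logits: torch.Tensor,
                           teacher_logits: torch.Tensor,
                           temperature: float = 2.0) -> torch.Tensor:
    """KL divergence between teacher and student next-token distributions."""
    t = temperature
    log_p_student = F.log_softmax(student_logits / t, dim=-1)
    p_teacher = F.softmax(teacher_logits / t, dim=-1)
    # The t**2 factor rescales gradients so the loss magnitude stays
    # comparable across different temperatures.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * t * t

def hidden_state_loss(student_hidden: torch.Tensor,
                      teacher_hidden: torch.Tensor,
                      projection: nn.Linear) -> torch.Tensor:
    """MSE between projected student hidden states and teacher hidden states.

    The projection handles mismatched hidden sizes (e.g. a 1024-dim student
    distilled from a 4096-dim teacher) and is trained alongside the student.
    """
    return F.mse_loss(projection(student_hidden), teacher_hidden)
```

The practical crux of the comparison: token-probability matching needs only the teacher's output logits, while hidden-state matching requires access to intermediate activations and an extra projection whenever the two models' hidden sizes differ.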
Syllabus
Faster, Smaller, Smarter: How Liquid AI Is Redefining LLM Efficiency
Taught by
Abhishek Thakur