Overview
This 36-minute AWS Events talk explores how knowledge distillation can transfer capabilities from large language models to smaller, faster models while maintaining performance. Discover how organizations can achieve significant improvements in throughput and cost efficiency through distillation techniques. Learn how to implement distillation with Amazon Bedrock or build custom solutions on Amazon SageMaker. Watch Julien Simon demonstrate how Arcee AI leverages distillation to develop industry-leading small language models (SLMs) based on open architectures. Get introduced to the open-source DistillKit library and see demonstrations of newly distilled SLMs from Arcee AI. Featuring insights from AWS experts Laurens van der Maas, Aleksandra Dokic, and Jean Launay Orlanda, this presentation provides practical knowledge for optimizing AI model deployment.
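To make the core idea concrete, here is a minimal sketch of the classic temperature-softened distillation loss (Hinton-style soft targets), in which a small student model is trained to match a large teacher's output distribution. This is an illustrative example only; it does not reproduce the specific methods used by DistillKit, Amazon Bedrock, or SageMaker, and the function names are hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature > 1 softens the distribution, exposing the teacher's
    # relative preferences among wrong classes ("dark knowledge").
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # KL divergence from the student's softened distribution to the
    # teacher's, scaled by T^2 so the gradient magnitude stays roughly
    # constant as the temperature changes.
    p = softmax(teacher_logits, temperature)  # teacher soft targets
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2

# A student that agrees with the teacher incurs zero loss;
# disagreement yields a positive penalty to minimize during training.
print(distillation_loss([1.0, 2.0, 0.5], [1.0, 2.0, 0.5]))  # ~0.0
print(distillation_loss([2.0, 1.0, 0.5], [1.0, 2.0, 0.5]) > 0)  # True
```

In practice this soft-target loss is usually combined with the ordinary cross-entropy loss on ground-truth labels, weighted by a mixing coefficient.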
Syllabus
Knowledge Distillation: Build Smaller, Faster AI Models | AWS Events
Taught by
AWS Events