Edge of Tomorrow: Unleashing the Power of Small LLMs for Generative AI at the Edge
EDGE AI FOUNDATION via YouTube
Overview
Watch a 16-minute conference talk exploring the development and implementation of compact Large Language Models (LLMs) for edge computing devices. Learn about innovative architectural approaches and optimization techniques that enable real-time processing without cloud dependency, while maintaining performance standards. Discover how memory footprint reduction impacts hardware costs and accessibility of AI deployment. Gain insights into the strategic importance of small LLMs, their optimization methods, and practical speed-up examples through the expertise of Syntiant's VP of Product and Business Development. Follow the journey from current state-of-the-art technologies to future visions for edge AI, including discussions on hardware customization and strategic partnerships in the field.
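To make the memory-footprint point concrete: one widely used technique for shrinking LLM weights on edge devices (an illustration only; the talk does not specify Syntiant's exact method) is post-training quantization, storing weights as 8-bit integers plus a scale factor instead of 32-bit floats, roughly a 4x reduction per weight. A minimal sketch:

```python
# Hedged sketch of symmetric int8 post-training quantization, one common
# memory-footprint-reduction technique for edge deployment. This is an
# illustrative example, not the specific approach described in the talk.

def quantize_int8(weights):
    """Map float weights to int8 values with a single per-tensor scale."""
    scale = max(abs(w) for w in weights) / 127 or 1.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 values and scale."""
    return [x * scale for x in q]

weights = [0.12, -0.5, 0.33, 0.91, -0.07]
q, scale = quantize_int8(weights)

fp32_bytes = len(weights) * 4      # 4 bytes per 32-bit float
int8_bytes = len(q) * 1 + 4        # 1 byte per int8 weight, plus one fp32 scale
print(fp32_bytes, int8_bytes)      # 20 vs 9 bytes for this tiny tensor
```

At real model scale the one-time scale overhead is negligible, so the saving approaches the full 4x; this is one reason compact LLMs can fit in on-device memory without cloud offload.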
Syllabus
Introduction
Syntiant Journey
Compact LLMs
Current State of the Art
LLM Architectures
Optimizations
Speed Up Example
Summary
Future Vision
Strategic Partners
Taught by
EDGE AI FOUNDATION