Overview
Learn how to avoid performance pitfalls when deploying AI models on edge devices by understanding the relationship between model architecture and hardware optimization. Discover why theoretical efficiency metrics such as FLOPs and parameter counts often fail to predict real-world performance, using the example of MobileNet V2, which runs slower than ResNet18 on GPUs despite being "more efficient" by those metrics. Explore hardware selection strategies, including cases where NPUs outperform GPUs despite lower TOPS ratings because of factors like operator support, kernel fusion, and memory behavior.

Master a four-step framework for hardware-aware development: profiling on real devices from day one, verifying operator compatibility early, automating bottleneck discovery in CI pipelines, and optimizing with hardware-specific techniques like targeted pruning and mixed precision. Examine a practical case study of Llama 3.2-1B optimization on Snapdragon Gen 3, which achieved 31% faster token generation, 25% faster prompt processing, and 126% faster initialization with minimal accuracy loss.
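To illustrate the "profile on real devices" and "automate bottleneck discovery in CI" steps, here is a minimal sketch of a latency check in Python. It is not taken from the course: the names `measure_latency_ms`, `within_budget`, and `fake_model` are hypothetical, and `fake_model` is a CPU stand-in for a real inference call. On a GPU or NPU with asynchronous execution, the timed callable would also need to wait for the device to finish before returning.

```python
import statistics
import time
from typing import Callable


def measure_latency_ms(run_once: Callable[[], object],
                       warmup: int = 5,
                       iters: int = 30) -> float:
    """Return the median wall-clock latency of run_once, in milliseconds."""
    # Warm-up runs let caches and lazy initialization settle before timing.
    for _ in range(warmup):
        run_once()
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        run_once()
        samples.append((time.perf_counter() - start) * 1000.0)
    # The median is less sensitive to occasional scheduling spikes than the mean.
    return statistics.median(samples)


def within_budget(latency_ms: float, budget_ms: float) -> bool:
    """True if the measured latency meets the budget (usable as a CI gate)."""
    return latency_ms <= budget_ms


def fake_model() -> int:
    # Hypothetical stand-in workload; replace with a real on-device inference call.
    return sum(i * i for i in range(10_000))


if __name__ == "__main__":
    latency = measure_latency_ms(fake_model)
    print(f"median latency: {latency:.3f} ms")
    print("within 50 ms budget:", within_budget(latency, 50.0))
```

The point of measuring wall-clock time rather than counting FLOPs is that the measured number reflects operator support, memory behavior, and other hardware effects that the theoretical metrics leave out.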
Syllabus
The Optimization Trap in Edge AI
Taught by
EDGE AI FOUNDATION