Fused Depthwise Tiling for Memory Optimization in TinyML Deep Neural Network Inference
EDGE AI FOUNDATION via YouTube
Overview
Watch a 21-minute research symposium presentation exploring Fused Depthwise Tiling (FDT), a novel method for optimizing memory usage in TinyML deep neural network inference. Learn how to deploy deep neural network inference on resource-constrained microcontrollers, with applications such as audio keyword detection and radar-based gesture recognition. Discover how FDT reduces memory usage without performance overhead and applies to a broader range of network layers than existing tiling methods. Explore the complete end-to-end deployment flow, including a new path discovery method that automates tiling configuration and buffer layout planning. Examine results showing memory reductions of up to 76.2% in evaluated models where existing tiling methods fell short, while preserving inference performance on TinyML hardware.
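The memory benefit of depthwise (channel-wise) tiling can be illustrated with a minimal sketch. This is not the presentation's implementation; all shapes, tile sizes, and function names below are illustrative assumptions. The idea is that executing two fused layers one channel tile at a time means only one tile's slice of the intermediate feature map needs to be buffered, rather than the whole map:

```python
# Hypothetical sketch of the memory argument behind depthwise tiling.
# Shapes and the tile size are illustrative, not taken from the talk.

def peak_buffer_unfused(h: int, w: int, c: int) -> int:
    """Layer-by-layer execution: the full h x w x c intermediate
    feature map must be held between the two layers."""
    return h * w * c

def peak_buffer_fdt(h: int, w: int, c: int, tile_channels: int) -> int:
    """Fused execution with channel tiling: the fused block processes
    tile_channels channels at a time, so only one tile's slice of the
    intermediate feature map is live at any point."""
    assert c % tile_channels == 0, "assume tiles divide channels evenly"
    return h * w * tile_channels

full = peak_buffer_unfused(32, 32, 64)    # 65536 elements
tiled = peak_buffer_fdt(32, 32, 64, 8)    # 8192 elements
print(f"unfused: {full}, tiled: {tiled}, saving: {1 - tiled / full:.1%}")
```

Depthwise and depthwise-separable convolutions suit this scheme because each output channel depends on only one (or a few) input channels, so channel tiles can be computed independently.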
Syllabus
Intro
Machine Learning on Edge Devices
Challenge: Memory
Intermediate Buffers
Loop Tiling
Fused Tiling
Fused Feature Map Tiling (FFMT)
Fused Depthwise Tiling (FDT)
FDT for Convolutions
End-to-end Deployment Flow
Path Discovery - FDT
Implementation
Summary
Taught by
EDGE AI FOUNDATION