Dynamic Neural Network Compression for Scalable AI Deployment
Association for Computing Machinery (ACM) via YouTube
Overview
This 56-minute conference talk introduces a framework for compressing and decompressing deep learning models in real time, so that AI systems can adapt to changing hardware, bandwidth, and task demands. It presents a new system and file format, Automatic Decompression Instructions (.adi), aimed at scalable AI deployment across both cloud and edge environments. The framework targets a central deployment problem: large models must scale efficiently across diverse computing environments while maintaining performance. The presentation covers the technical foundations of dynamic compression, real-world implementation strategies, and the potential impact on deployment scalability across different platforms and resource constraints.
Syllabus
Dynamic Neural Network Compression for Scalable AI Deployment
Taught by
Association for Computing Machinery (ACM)