Overview
Learn about DecDEC, a novel inference scheme that enhances the quality of low-bit quantized Large Language Models while maintaining the memory and latency benefits of quantization. Discover how this systems approach stores residual matrices in CPU memory and dynamically fetches corrections for salient channels identified by activation outliers, enabling adaptive error compensation that responds to the dynamic nature of activation distributions. Explore the technical implementation that achieves significant perplexity improvements, such as reducing a 3-bit Llama-3-8B-Instruct model's perplexity from 10.15 to 9.12 while adding minimal GPU memory overhead and only 1.7% inference slowdown, making it particularly valuable for on-device deployment scenarios with limited hardware resources.
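The idea described above can be illustrated with a small sketch. This is not the DecDEC implementation: it assumes a simple uniform round-to-nearest quantizer, picks salient input channels as the top-k entries of the activation vector by magnitude, and uses an in-memory list lookup in place of the CPU-to-GPU fetch. All function names (`quantize`, `residual_of`, `top_k_channels`, `compensated_matvec`) are hypothetical and chosen for illustration only.

```python
import random


def quantize(w, bits=3):
    """Uniform symmetric round-to-nearest quantization (an assumed, simple quantizer)."""
    levels = 2 ** (bits - 1) - 1          # e.g. 3 positive levels for 3 bits
    max_abs = max(abs(v) for row in w for v in row)
    scale = max_abs / levels if max_abs > 0 else 1.0
    return [[round(v / scale) * scale for v in row] for row in w]


def residual_of(w, w_q):
    """Residual matrix R = W - Q(W); in the scheme described above it lives in CPU memory."""
    return [[a - b for a, b in zip(rw, rq)] for rw, rq in zip(w, w_q)]


def matvec(x, w):
    """Compute y = x @ w, where w has shape (len(x), n_out)."""
    n_out = len(w[0])
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(n_out)]


def top_k_channels(x, k):
    """Indices of the k input channels with the largest activation magnitude."""
    return sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)[:k]


def compensated_matvec(x, w_q, residual, k):
    """Quantized matvec plus corrections for the k most salient channels only."""
    y = matvec(x, w_q)
    for i in top_k_channels(x, k):
        row = residual[i]                 # stands in for fetching one residual row
        for j in range(len(y)):
            y[j] += x[i] * row[j]
    return y


def l2_error(a, b):
    """Euclidean distance between two output vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5


if __name__ == "__main__":
    random.seed(0)
    n_in, n_out = 8, 4
    w = [[random.uniform(-1, 1) for _ in range(n_out)] for _ in range(n_in)]
    x = [random.gauss(0, 1) for _ in range(n_in)]
    x[3] *= 20                            # inject one activation outlier
    w_q = quantize(w)
    r = residual_of(w, w_q)
    exact = matvec(x, w)
    for k in (0, 2, n_in):
        print(k, l2_error(exact, compensated_matvec(x, w_q, r, k)))
```

Because the salient channels are recomputed from each activation vector, the set of corrected rows changes from input to input, which is the adaptive behavior the description refers to. With `k` equal to the number of input channels the sketch recovers the full-precision output, and with `k = 0` it reduces to plain quantized inference; intermediate values trade transfer volume against accuracy.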
Syllabus
OSDI '25 - DecDEC: A Systems Approach to Advancing Low-Bit LLM Quantization
Taught by
USENIX