Unlocking Local LLMs with Quantization
Overview
Learn about quantization's evolution and its impact on local Large Language Models in this 40-minute conference talk from Hugging Face's Marc Sun. Explore the journey of quantization through influential papers like QLoRA and GPTQ, and discover its practical applications across different stages of model development. Gain insights into pre-training a 1.58-bit model, implementing fine-tuning techniques with PEFT + QLoRA, and optimizing inference performance using torch.compile or custom kernels. Understand how the open-source community is making quantized models more accessible through transformers and GGUF models from llama.cpp, enabling broader adoption of local LLM implementations.
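To ground the ideas above, here is a toy sketch of round-to-nearest absmax quantization, the basic mechanism behind the low-bit schemes the talk surveys. This is a pure-Python illustration, not the talk's method: real libraries (bitsandbytes, GPTQ kernels, llama.cpp's GGUF formats) use per-block scales, packed storage, and optimized kernels.

```python
def quantize_absmax(weights, bits=8):
    """Map floats to signed integers using a single per-tensor absmax scale."""
    qmax = 2 ** (bits - 1) - 1          # e.g. 127 for int8
    scale = max(abs(w) for w in weights) / qmax
    return [round(w / scale) for w in weights], scale

def dequantize(q, scale):
    """Recover approximate floats from the integers and the stored scale."""
    return [v * scale for v in q]

weights = [0.3, -1.2, 0.07, 0.9]
q, scale = quantize_absmax(weights)
approx = dequantize(q, scale)
# q holds small integers in [-127, 127]; approx is close to the originals,
# with error bounded by half the scale per element.
```

Storing `q` and `scale` instead of the float weights is what shrinks a model's memory footprint; lower bit widths (4-bit in QLoRA/GPTQ, ternary in 1.58-bit models) trade more reconstruction error for further compression.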
Syllabus
Unlocking Local LLMs with Quantization - Marc Sun, Hugging Face
Taught by
Linux Foundation