Overview
Explore advanced quantization techniques for efficient Large Language Model (LLM) inference in this technical seminar presented by Assistant Professor Jungwook Choi of Hanyang University. Delve into the evolution of Transformer models, from their 2017 introduction to their current role as the foundation of Neural Machine Translation and Natural Language Processing. Learn how the Multi-Head Attention mechanism drives representation learning and how pre-trained language models have scaled to hundreds of billions of parameters. Understand the challenges of deploying such massive Transformer models, including their computational demands and power consumption, and discover practical solutions through weight and activation quantization techniques designed specifically for edge-device deployment. Gain insight into optimizing model efficiency while maintaining performance across applications in computer vision and speech recognition.
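To make the core idea concrete, here is a minimal sketch of symmetric per-tensor INT8 weight quantization, one of the simplest techniques in the family the seminar covers. This is an illustrative example, not the speaker's method: the function names and the NumPy-based setup are assumptions for demonstration only.

```python
import numpy as np

def quantize_int8(w):
    # Symmetric per-tensor quantization: map float weights onto
    # the signed 8-bit integer range [-127, 127] using one scale.
    scale = np.max(np.abs(w)) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover an approximation of the original float weights.
    return q.astype(np.float32) * scale

# Toy weight matrix standing in for one layer of a Transformer.
rng = np.random.default_rng(0)
w = rng.normal(size=(4, 8)).astype(np.float32)

q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

# Each INT8 weight occupies 1 byte instead of 4, and the rounding
# error per element is bounded by half a quantization step.
max_err = np.max(np.abs(w - w_hat))
print(f"scale = {scale:.4f}, max reconstruction error = {max_err:.4f}")
```

Real LLM quantization pipelines refine this basic recipe with per-channel scales, activation calibration, and outlier handling, which is where the techniques discussed in the talk come in.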
Syllabus
tinyML Asia - Jungwook Choi: Quantization Techniques for Efficient Large Language Model Inference
Taught by
EDGE AI FOUNDATION