
Quantization Techniques for Efficient Large Language Model Inference

EDGE AI FOUNDATION via YouTube

Overview

Explore advanced quantization techniques for efficient Large Language Model (LLM) inference in this technical seminar presented by Assistant Professor Jungwook Choi of Hanyang University. Delve into the evolution of Transformer models, from their 2017 inception to their current role as the foundation of Neural Machine Translation and Natural Language Processing. Learn about the Multi-Head Attention mechanism's role in representation learning and how pre-trained language models have scaled to hundreds of billions of parameters. Understand the challenges of deploying such massive Transformer models, including their computational demands and power consumption, and discover practical solutions through weight and activation quantization techniques designed specifically for edge-device deployment. Gain insights into optimizing model efficiency while maintaining performance across applications such as computer vision and voice recognition.
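
For context on the core idea, here is a minimal sketch of symmetric per-tensor int8 weight quantization, a standard baseline form of the weight quantization the seminar covers. This is an illustration only, not code from the talk; the helper names (`quantize_int8`, `dequantize`) and the 8-bit symmetric scheme are common textbook choices, not the speaker's specific method.

```python
import numpy as np

def quantize_int8(w, eps=1e-8):
    """Symmetric per-tensor int8 quantization: w is approximated by scale * q."""
    # Map the largest absolute weight to 127; eps guards against an all-zero tensor.
    scale = max(np.abs(w).max(), eps) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximate float tensor from the int8 codes and scale."""
    return q.astype(np.float32) * scale

# Example: quantize a random weight matrix and measure the reconstruction error.
rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4)).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
print("max abs error:", np.abs(w - w_hat).max())
```

Storing `q` (one byte per weight) plus a single scale cuts memory traffic roughly 4x versus float32, which is why schemes like this matter for running LLMs on edge devices.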

Syllabus

tinyML Asia - Jungwook Choi: Quantization Techniques for Efficient Large Language Model Inference

Taught by

EDGE AI FOUNDATION

