Learn about quantization techniques and key-value (KV) cache optimization methods for improving the efficiency and performance of large language models in this 82-minute lecture from the University of Utah Data Science program. The lecture covers the fundamental concepts and practical implementations of these memory and compute optimization strategies.