BanditSpec - Adaptive Speculative Decoding via Bandit Algorithms

Centre for Networked Intelligence, IISc via YouTube

Overview

Attend this academic seminar on BanditSpec, a training-free online learning framework that uses bandit algorithms to adaptively choose speculative decoding configurations for Large Language Models (LLMs). Learn how the approach formulates hyperparameter selection as a Multi-Armed Bandit problem to accelerate LLM inference while maintaining text generation quality. Discover the UCBSpec and EXP3Spec algorithms, designed for both stochastic and adversarial reward settings, with theoretical analysis demonstrating optimal regret performance. Examine empirical experiments with LLaMA3 and Qwen2 models that test the framework in real-life LLM serving scenarios with diverse input prompts. Gain insight into the information-theoretic impossibility results and stopping time regret bounds that form the theoretical foundation of the method. The presentation by Professor Vincent Y. F. Tan of the National University of Singapore covers the mathematical formulation, algorithmic design, theoretical guarantees, and practical implementation of this approach to improving LLM inference throughput without additional training or offline model alignment.
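To give a feel for what treating configuration selection as a Multi-Armed Bandit problem means, the sketch below runs the textbook UCB1 rule over a few candidate configurations. It is not the seminar's UCBSpec algorithm, whose details are not given in this listing. The arm values, the Bernoulli reward model, and the names `ucb1_select`, `run_bandit`, and `HYPOTHETICAL_ARMS` are all assumptions made for illustration.

```python
import math
import random


def ucb1_select(counts, means, t, c=2.0):
    """Return the arm with the highest UCB1 score at round t (1-indexed)."""
    # Try every arm once before relying on the confidence bonus.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    scores = [m + math.sqrt(c * math.log(t) / n) for m, n in zip(means, counts)]
    return max(range(len(scores)), key=scores.__getitem__)


def run_bandit(arm_means, rounds, seed=0):
    """Simulate UCB1 against arms that pay Bernoulli rewards (an assumption)."""
    rng = random.Random(seed)
    counts = [0] * len(arm_means)   # pulls per arm
    means = [0.0] * len(arm_means)  # running average reward per arm
    for t in range(1, rounds + 1):
        arm = ucb1_select(counts, means, t)
        # Stand-in reward: 1 with probability arm_means[arm], else 0.
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]
    return counts, means


# Hypothetical arms: each number is the mean reward of one candidate
# speculative decoding configuration. The values are made up.
HYPOTHETICAL_ARMS = [0.3, 0.5, 0.8]
counts, means = run_bandit(HYPOTHETICAL_ARMS, rounds=2000)
print(counts)
```

In a real serving system the reward would come from the decoder itself rather than a random draw, and the choice of reward signal is part of what the seminar's formulation addresses.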

Syllabus

Time: 5:30 PM - 6:30 PM IST

Taught by

Centre for Networked Intelligence, IISc
