High-Performance LLM Serving on Intel - vLLM for XPU, HPU and CPU

Anyscale via YouTube

Overview

This 26-minute conference talk from Ray Summit 2025 covers how to deploy high-performance vLLM inference across Intel's hardware portfolio: GPUs (XPU), Gaudi accelerators (HPU), and CPUs. Intel engineers Ding Ke and Chendi Xue present updates on vLLM enablement across Intel platforms, including feature parity and performance under the new vLLM v1 architecture, with support for the KV connector, data parallelism, and multi-token prediction. The talk also covers Intel-optimized model support, including DeepSeek and GPT-OSS, and how the supported model ecosystem continues to expand.

The speakers describe Intel's strategies for migrating vLLM capabilities from CUDA to non-CUDA environments while minimizing developer friction by aligning APIs with torch.cuda behavior. They review open-sourced kernels (CUTLASS, Triton) that have been upstreamed into vLLM, bitsandbytes, and other libraries, and outline Intel's roadmap of upcoming optimizations and capabilities intended to improve performance and developer experience across Intel hardware. Viewers come away with practical knowledge for deploying vLLM on Intel platforms and lessons from real-world migration challenges, along with Intel's vision for a unified, developer-friendly AI ecosystem.

Syllabus

High-Performance LLM Serving on Intel: vLLM for XPU, HPU & CPU | Ray Summit 2025

Taught by

Anyscale
