Overview
Learn how to serve large language models at scale using llm-d, a Kubernetes-native solution for distributed inference that supports any model across diverse hardware accelerators. Robert Shaw presents this 33-minute conference talk from DevConf.US 2025.
Syllabus
llm-d: Kubernetes Native Distributed Inferencing - DevConf.US 2025
Taught by
DevConf