
YouTube

Scaling Generative AI Inference with llm-d

DevConf via YouTube

Overview

Learn how to deploy and serve large generative AI models in production at scale using llm-d, an open-source, Kubernetes-native distributed inference serving stack. Explore the significant challenges of deploying large generative AI models in production environments and discover how llm-d provides streamlined solutions for developers. Understand llm-d's architecture and key features that enable fast time-to-value and competitive performance across diverse hardware accelerators. Gain practical knowledge about leveraging tested and benchmarked recipes for production deployments, with a focus on real-world applications and industry best practices for scaling generative AI inference systems.
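To give a concrete sense of what "Kubernetes-native inference serving" means in practice, the sketch below shows a minimal generic Kubernetes Deployment for a model server. This is an illustrative assumption only, not llm-d's actual resources or recipes: the name, container image, model, flags, and port are all hypothetical placeholders.

```yaml
# Hypothetical sketch of Kubernetes-native model serving.
# llm-d's real deployment recipes and resource types differ;
# every name, image, and flag below is a placeholder.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-llm-server        # hypothetical name
spec:
  replicas: 2                     # scale out inference replicas
  selector:
    matchLabels:
      app: example-llm-server
  template:
    metadata:
      labels:
        app: example-llm-server
    spec:
      containers:
      - name: server
        image: example.com/llm-server:latest   # placeholder image
        args: ["--model", "example-model", "--port", "8000"]
        ports:
        - containerPort: 8000
        resources:
          limits:
            nvidia.com/gpu: 1     # one accelerator per replica
```

In a stack like the one the talk describes, replicas such as these would sit behind a routing layer that load-balances inference requests across accelerators; the Deployment above only illustrates the base Kubernetes building block.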

Syllabus

Scaling Generative AI Inference with llm-d - DevConf.IN 2026

Taught by

DevConf

