
Introducing AIBrix - Cost-Effective and Scalable Kubernetes Control Plane for vLLM

CNCF [Cloud Native Computing Foundation] via YouTube

Overview

Learn about AIBrix, a Kubernetes-native control plane for managing large-scale LLM inference workloads, presented by ByteDance engineers in a CNCF conference talk. See how it addresses the complexity of scaling LLM inference beyond what a high-performance engine such as vLLM provides on its own. Explore AIBrix's pluggable architecture, with components for LLM-specific autoscaling, high-density LoRA management, distributed KV cache, heterogeneous serving, and efficient model loading. Understand the co-design philosophy that enables advanced optimizations through tight integration with inference engines. Review the benchmarks and performance evaluations showing how AIBrix improves scalability and resource utilization in production environments, and take away practical guidance for running cost-effective, scalable LLM inference on Kubernetes.

Syllabus

Introducing AIBrix: Cost-Effective and Scalable Kubernetes Control Plan... Jiaxin Shan & Liguang Xie

Taught by

CNCF [Cloud Native Computing Foundation]
