Introduction to Disaggregated Serving in TensorRT-LLM

Nvidia via YouTube

Overview

Learn about disaggregated serving in TensorRT-LLM in this 37-minute technical presentation from NVIDIA experts. Discover the potential benefits of a disaggregated serving architecture and gain practical knowledge of how to implement it with TensorRT-LLM. Explore current performance metrics and benchmarks for popular large language models served this way, and see how the technique can improve resource utilization and scalability in production environments.

Syllabus

Introduction to disaggregated serving in TensorRT-LLM

Taught by

NVIDIA Developer
