The New PyTorch Architecture for TensorRT-LLM

NVIDIA via YouTube

Overview

Explore a 52-minute talk from NVIDIA introducing the new PyTorch-based architecture for TensorRT-LLM, designed to improve user experience and developer velocity for large language model (LLM) deployments. Learn how the architecture makes it easier to build custom models, integrate new kernels, and extend runtime functionality while delivering state-of-the-art performance on NVIDIA GPUs. Concrete examples show how the architecture supports quick customizations without giving up that performance.

Syllabus

Beyond the Algorithm with NVIDIA: The New PyTorch Architecture for TensorRT-LLM

Taught by

NVIDIA Developer
