The Practice of Doing Performance Analysis and Optimization with TensorRT-LLM

NVIDIA via YouTube

Overview

Learn best practices for TensorRT-LLM performance analysis and optimization in this 54-minute technical presentation from NVIDIA experts. The talk covers how to profile TensorRT-LLM with specialized tools, interpret the profiling results, identify performance bottlenecks, and apply optimization strategies. It presents a systematic approach to improving large language model inference performance, with practical guidance on using profiling tools and reading performance metrics.

Syllabus

The practice of doing performance analysis/optimization with TensorRT-LLM

Taught by

NVIDIA Developer
