Overview
Learn best practices for TensorRT-LLM performance analysis and optimization in this 54-minute technical presentation from NVIDIA experts. Discover how to profile TensorRT-LLM with specialized tools, interpret the profiling results, identify performance bottlenecks, and apply optimization strategies. The session takes a systematic approach to improving large language model inference performance with TensorRT-LLM, with practical guidance on using profiling tools and reading performance metrics to achieve better computational efficiency.
Syllabus
The practice of doing performance analysis/optimization with TensorRT-LLM
Taught by
NVIDIA Developer