Overview
Learn best practices for TensorRT-LLM performance analysis and optimization in this 54-minute technical presentation from NVIDIA experts. Discover how to analyze TensorRT-LLM performance with specialized profiling tools, interpret the profiling results, identify performance bottlenecks, and implement optimization strategies. The presentation covers a systematic approach to improving large language model inference performance with TensorRT-LLM, with practical guidance on using profiling tools and reading performance metrics.
Syllabus
The practice of doing performance analysis/optimization with TensorRT-LLM
Taught by
NVIDIA Developer