Overview
Learn about scaffolding, a new framework in TensorRT-LLM designed to support inference-time compute methods such as majority vote, best-of-N, and Monte Carlo Tree Search (MCTS), in this 56-minute technical presentation from NVIDIA experts. Discover how to implement custom inference-time compute methods with the scaffolding framework, and understand how it is designed to balance usability, modularity, and high performance. Explore community contributions and learn how to extend TensorRT-LLM for advanced inference-time compute scenarios.
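As a rough illustration of two of the methods named above, the sketch below shows majority vote and best-of-N selection in plain Python. It does not use the TensorRT-LLM scaffolding API; the function names `majority_vote` and `best_of_n` and the `score` callback are hypothetical, and the candidate answers are assumed to have been sampled from a model already.

```python
from collections import Counter
from typing import Callable, Sequence


def majority_vote(answers: Sequence[str]) -> str:
    """Return the most frequent answer; ties go to the answer seen first."""
    if not answers:
        raise ValueError("need at least one candidate answer")
    # Counter.most_common orders equal counts by first occurrence.
    return Counter(answers).most_common(1)[0][0]


def best_of_n(candidates: Sequence[str], score: Callable[[str], float]) -> str:
    """Return the candidate with the highest score (first one wins ties).

    `score` stands in for a reward model or verifier (an assumption here).
    """
    if not candidates:
        raise ValueError("need at least one candidate")
    return max(candidates, key=score)
```

In both cases the extra inference-time compute goes into sampling several candidates; the selection step itself is cheap. Majority vote needs only the final answers, while best-of-N needs some scoring function to rank them.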
Syllabus
Introduction of inference-time compute support in TensorRT-LLM
Taught by
NVIDIA Developer