Learn How to Run an LLM Inference Performance Benchmark on NVIDIA GPUs

DevConf via YouTube

Overview

Learn to set up and run an LLM inference performance benchmark on NVIDIA GPUs with a fully open-source toolchain in this hands-on tutorial from DevConf.US 2025. The session starts with foundational GPU setup: configuring RPM Fusion, installing the akmod-nvidia driver, and validating the hardware with nvidia-smi. It then configures containerized GPU access using Podman 5.x and the NVIDIA Container Toolkit's Container Device Interface (CDI) for secure rootless operation.

Deploy the lightweight vLLM inference engine with locally cached models from Hugging Face and expose an OpenAI-compatible HTTP endpoint for standardized API access. Use GuideLLM's automated load generation to sweep request rates, capture latency distributions, measure throughput ceilings, and collect tokens-per-second statistics as structured JSON for analysis.

Live demonstrations highlight common configuration pitfalls and provide checklists applicable across Red Hat-derived distributions. The tutorial also covers how to scale benchmarks to larger language models and multi-GPU configurations, and how architectural decisions affect measurement accuracy. Ready-to-use scripts, configuration templates, and resource links are provided, so no prior experience with containers, CUDA programming, or performance benchmarking is required.
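
To make the measured quantities concrete, here is a minimal Python sketch of how latency percentiles and tokens-per-second figures can be derived from per-request timings. It is an illustration only, not the tutorial's scripts and not GuideLLM's implementation or output schema: the `RequestRecord` fields and the `summarize` helper are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from statistics import quantiles


@dataclass
class RequestRecord:
    """One completed request (hypothetical shape, not GuideLLM's JSON schema)."""
    start_s: float       # time the request was sent, in seconds
    end_s: float         # time the final token arrived, in seconds
    output_tokens: int   # number of tokens generated for this request


def summarize(records: list[RequestRecord]) -> dict[str, float]:
    """Return latency percentiles and aggregate throughput for one load level."""
    if len(records) < 2:
        raise ValueError("need at least two records to compute percentiles")

    # End-to-end latency of each request.
    latencies = [r.end_s - r.start_s for r in records]

    # quantiles(n=100) returns the 99 cut points p1..p99.
    cuts = quantiles(latencies, n=100, method="inclusive")

    # Throughput is measured over the whole benchmark window,
    # from the first request sent to the last response finished.
    window_s = max(r.end_s for r in records) - min(r.start_s for r in records)
    if window_s <= 0:
        raise ValueError("benchmark window must have positive duration")

    total_tokens = sum(r.output_tokens for r in records)
    return {
        "p50_latency_s": cuts[49],
        "p95_latency_s": cuts[94],
        "p99_latency_s": cuts[98],
        "output_tokens_per_s": total_tokens / window_s,
        "requests_per_s": len(records) / window_s,
    }
```

Calling `summarize` once per request rate in a sweep yields one summary per load level; the rate at which `output_tokens_per_s` stops increasing while the tail latencies keep climbing approximates the throughput ceiling.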

Syllabus

Learn How to Run an LLM Inference Performance Benchmark on NVIDIA GPUs - DevConf.US 2025

Taught by

DevConf
