Overview
Explore SemiAnalysis's open-source InferenceMAX benchmark suite in this 25-minute conference talk, which comprehensively evaluates AI inference performance across major hardware platforms. Learn how the benchmarking framework sweeps the latest software optimizations across AMD, NVIDIA, and upcoming TPU and Trainium systems, measuring key metrics including token throughput, energy consumption, cost efficiency, and interactivity. Discover how InferenceMAX covers frontier AI models with nightly, reproducible test results that reveal real-world inference performance beyond marketing claims and industry hype. Understand the core principle that AI computing is a product of hardware-software co-design, where systems, kernels, and accelerators must evolve together to unlock full performance. Gain insight into how the suite provides transparent, data-driven analysis of how AI inference actually behaves in practical deployments rather than in theoretical benchmarks.
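To make the metrics mentioned above concrete, here is a minimal sketch (not InferenceMAX's actual code; the function name and parameters are illustrative assumptions) of how throughput, interactivity, energy per token, and cost per million tokens relate to one another for a single benchmark run:

```python
# Hypothetical sketch of the headline inference metrics, not InferenceMAX code.

def inference_metrics(total_tokens, wall_seconds, joules,
                      dollars_per_hour, concurrent_users):
    """Derive common inference benchmark metrics from raw run measurements."""
    throughput = total_tokens / wall_seconds          # tokens/s, system-wide
    interactivity = throughput / concurrent_users     # tokens/s per user
    energy_per_token = joules / total_tokens          # joules/token
    # Instance cost for the run, normalized to dollars per million tokens.
    cost_per_mtok = dollars_per_hour / 3600 * wall_seconds / total_tokens * 1e6
    return {
        "throughput_tok_s": throughput,
        "interactivity_tok_s_user": interactivity,
        "energy_j_per_tok": energy_per_token,
        "cost_usd_per_mtok": cost_per_mtok,
    }

# Example: 1M tokens generated in 100 s on a $4/hr instance drawing 50 kJ,
# serving 64 concurrent users (all figures are made up for illustration).
m = inference_metrics(1_000_000, 100.0, 50_000.0, 4.0, 64)
```

Note the trade-off this exposes: raising batch size (more concurrent users) typically improves system throughput and cost efficiency while lowering per-user interactivity, which is why benchmarks report these metrics together rather than a single number.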
Syllabus
SemiAnalysis InferenceMAX: Benchmarking the AI Frontier
Taught by
Open Compute Project