From 0 To 500+ Models - Building a Robust Open Source AI Compiler With MLIR

Explore the engineering journey of building an MLIR-based compiler stack that supports inference for over 500 production AI models in this 39-minute conference talk from the Linux Foundation. Discover how Tenstorrent's team developed a flexible, modular open-source compiler architecture that integrates multiple frontends, including PyTorch, ONNX, and PaddlePaddle. Learn how custom passes are implemented for model optimization, how tools such as TTRT and Model Explorer help visualize and debug intermediate representations, and how manual fine-tuning can improve performance beyond what the compiler achieves by default. Gain insight into the engineering challenges, tooling, and debugging strategies that enabled scaling from zero to hundreds of production models, and see how the team is contributing back to the open-source community through bounty programs and developer-friendly tooling.