Overview
Learn about a novel transcompiler that automatically translates tensor programs across heterogeneous deep learning systems in this 11-minute conference talk from OSDI '25. Discover how QiMeng-Xpiler addresses the challenge of developing multiple low-level tensor programs for different platforms, such as GPUs and ASICs, by combining large language models (LLMs) with symbolic program synthesis in a neural-symbolic approach. Explore the key insight of leveraging LLMs' code generation capabilities to make search-based symbolic synthesis computationally tractable, including multiple LLM-assisted compilation passes that use pre-defined meta-prompts for program transformation. Understand how efficient symbolic program synthesis repairs incorrect code snippets at limited scale, and examine the hierarchical auto-tuning approach that systematically explores parameters and transformation pass sequences for high performance. Review experimental results demonstrating 95% average accuracy in correctly translating tensor programs across four distinct deep learning systems: Intel DL Boost with VNNI, NVIDIA GPU with CUDA, AMD MI with HIP, and Cambricon MLU with BANG, making "Write Once, Run Anywhere" for tensor programs a practical reality.
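The sketch below is a minimal conceptual illustration, not code from the talk or paper: it shows the general shape of a neural-symbolic loop like the one described, where an LLM proposes a translation guided by a meta-prompt and a small verification/repair step patches failures instead of searching the whole program space. All names here (META_PROMPT, llm_translate, verify, symbolic_repair, transcompile) are hypothetical stand-ins, and the function bodies are stubs so the example stays runnable.

```python
# Conceptual sketch of an LLM-guided translation pass with symbolic repair.
# Hypothetical names and stubbed logic; not the QiMeng-Xpiler implementation.

from typing import Optional

# A pre-defined meta-prompt guiding one transformation pass (assumed wording).
META_PROMPT = (
    "Translate the following CUDA tensor kernel into an equivalent HIP kernel, "
    "preserving semantics and memory layout:\n{source}"
)

def llm_translate(source: str, prompt_template: str) -> str:
    """Stand-in for an LLM call that proposes a translated program."""
    # A real pipeline would query a code-generation model with the meta-prompt;
    # here we simply return the input so the sketch runs without external services.
    _ = prompt_template.format(source=source)
    return source

def verify(candidate: str, reference: str, test_inputs: list) -> bool:
    """Stand-in for checking that the candidate matches the reference semantics."""
    return bool(candidate)  # placeholder: always "passes" in this sketch

def symbolic_repair(candidate: str) -> Optional[str]:
    """Stand-in for small-scale symbolic synthesis that patches a failing snippet."""
    return candidate  # placeholder: return the candidate unchanged

def transcompile(source: str, test_inputs: list, max_attempts: int = 3) -> Optional[str]:
    """LLM proposes a translation; verification gates it; repair fixes local failures."""
    candidate = llm_translate(source, META_PROMPT)
    for _ in range(max_attempts):
        if verify(candidate, source, test_inputs):
            return candidate
        repaired = symbolic_repair(candidate)
        if repaired is None:
            return None
        candidate = repaired
    return None

if __name__ == "__main__":
    cuda_kernel = "__global__ void add(float* a, float* b, float* c) { /* ... */ }"
    print(transcompile(cuda_kernel, test_inputs=[]))
```

In the approach the talk describes, passes like this would be chained and combined with hierarchical auto-tuning over parameters and pass orderings; the sketch only conveys the division of labor between the LLM proposal step and the limited-scope symbolic repair step.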
Syllabus
OSDI '25 - QiMeng-Xpiler: Transcompiling Tensor Programs for Deep Learning Systems with a...
Taught by
USENIX