AI Kernel Generation - What's Working, What's Not, What's Next

Explore how AI-generated kernels can accelerate custom PyTorch code without manual optimization effort in this 19-minute conference talk. Learn about the current landscape of PyTorch optimization frameworks such as Triton and MLX, and discover why hand-written, low-level kernels still deliver the strongest performance gains even though they are tedious and time-consuming to develop across multiple platforms. Understand the potential of automating kernel generation with AI and examine best practices for this approach, including how to test and validate AI-generated kernels. Find out which types of AI agents and contexts produce the best results, and review research findings showing how this methodology improved PyTorch inference performance on Apple devices. Gain insight into the current state of AI kernel generation, its existing limitations, and future directions for this emerging optimization technique.