Explore Waymo's End-to-End Multimodal Model for Autonomous Driving (EMMA) in this 17-minute conference talk by Research Scientist Jyh-Jing Hwang. Discover how multimodal large language models such as Gemini are reshaping autonomous driving through unified end-to-end architectures that map raw sensor data directly to driving decisions. Learn about EMMA's state-of-the-art performance in trajectory planning, 3D object detection, and road graph understanding, and examine the Drive&Gen research approach to sensor simulation for evaluating end-to-end motion planning models. Gain insight into the benefits of co-training across multiple autonomous driving tasks and the potential of controlled video generation for testing under varied environmental conditions. The presentation demonstrates how advanced AI techniques are being applied to complex real-world challenges in autonomous vehicle technology, showcasing the intersection of computer vision, machine learning, and practical automotive applications.