NVIDIA vGPU Support on Grace Blackwell Superchip - Architecture, Design, Upstreaming Status

This 29-minute talk from KVM Forum covers the architecture and virtualization capabilities of NVIDIA's Grace Blackwell Superchip, an Arm-based server platform for datacenter computing built around a unified, cache-coherent memory subsystem shared by the CPU and GPU. It explains how the NVLink-based chip-to-chip interconnect provides coherent memory access between CPU and GPU, giving the operating system unified control over memory allocation. It also describes how GPU memory poison errors are handled through CPU firmware, and how Address Translation Services (ATS) support lets the CPU and GPU share a virtual address space. The talk then shows how NVIDIA vGPU extends these capabilities to virtualized environments, enabling multi-tenancy and GPU sharing across multiple virtual machines. Topics include Multi-Instance GPU (MIG), which partitions a GPU into isolated instances that can each be assigned to a VM independently, and vSMMU and PASID (Process Address Space ID) support for process isolation in virtualized environments. The talk closes with the system architecture design, the vGPU implementation of platform-specific features, and the status of the ongoing upstreaming effort.