Overview
Learn about GPREEMPT, a GPU preemptive scheduling mechanism that addresses the trade-off between generality and efficiency in GPU resource management. Discover how this research from Tsinghua University and Renmin University of China tackles the challenge of co-locating workloads with different service-level agreement (SLA) requirements on GPUs, including latency-critical and best-effort tasks. Explore the limitations of existing preemption strategies: wait-based approaches suffer from significant preemption latency, while reset-based approaches require kernel idempotence, which limits their applicability. Understand how GPREEMPT implements a timeslice-based yield mechanism to enable context-switch preemption on GPUs while maintaining broad generality. Examine the hint-based pre-preemption technique, which overlaps the preemption process with data preparation to minimize context-switching overhead. Analyze the evaluation results, which show that GPREEMPT achieves low-latency preemption within 40 microseconds, comparable to executing only latency-critical tasks, while remaining applicable to non-idempotent workloads where reset-based mechanisms fail.
Syllabus
USENIX ATC '25 - GPREEMPT: GPU Preemptive Scheduling Made General and Efficient
Taught by
USENIX