Overview
Learn how to maximize the utilization of scarce LLM accelerators in this 17-minute conference talk from SREcon25 Europe/Middle East/Africa. Accelerators used for serving Large Language Models are in extremely limited supply, both globally and within organizations, so teams need to show that they use them efficiently to justify continued access. Drawing on Google's experience managing LLM infrastructure, the talk explores practical approaches to demonstrating effective resource usage and explains why proving efficient utilization is critical for keeping access to accelerators in competitive environments.
Syllabus
SREcon25 Europe/Middle East/Africa - Maximizing Utilization for LLM Accelerators
Taught by
USENIX