Standardizing GPU Management - Redfish, Telemetry, and Firmware Update Protocols

Open Compute Project via YouTube

Overview

Explore the collaborative efforts of major tech companies to standardize GPU management protocols in this 28-minute panel discussion from the Open Compute Project. Learn about the significant progress made by AMD, Google, Meta, Microsoft, and NVIDIA in developing scalable management and observability solutions for AI and high-performance computing workloads. Discover the successful publication of a DMTF Message Registry and Redfish Interoperability Profile for GPU management, along with recent work to standardize GPU telemetry interfaces for accessing time-series data, detailed crash dumps, and debug logs. Understand how these standardization efforts focus on supporting low-latency and time-sensitive data streams with service level objectives (SLOs) to improve integration and testability. Gain insights into how this foundational work enables interoperability and provides hyperscalers with consistent management capabilities across multi-vendor GPU deployments while reducing fragmented requests to GPU suppliers.

Syllabus

Panel: Standardizing GPU Management - Redfish, Telemetry, and Firmware Update Protocols

Taught by

Open Compute Project
