Overview
Learn about Meta's large-scale deployment of liquid-cooled GB200 hardware in this 16-minute conference talk from the Open Compute Project. Discover the technical challenges of implementing liquid cooling for AI hardware at unprecedented scale, where performance, quality, and productivity requirements exceed those of previous liquid-cooled deployments. Explore the approach to designing technical requirements and workflows that ensure hardware meets performance and quality expectations before datacenter installation. Examine key considerations for transportation logistics, coolant health maintenance, and handling leaks in the field. Understand how Meta's dedicated engineering efforts enabled the delivery of hundreds of GB200 racks per week while keeping liquid-related issues below statistically significant levels.
Syllabus
Enablement of Liquid Cooled GB200 At Scale
Taught by
Open Compute Project