Scaling the AI Infrastructure to Data Center Regions
Open Compute Project via YouTube
Overview
Learn how to scale artificial intelligence infrastructure across data center regions in this 21-minute conference talk by Dan Rabinovitsj, VP of Data Center Infrastructure at Meta, presented at the Open Compute Project. Discover the strategic approaches and technical considerations required to expand AI computing capabilities beyond single data centers to regional deployments. Explore the challenges of distributed AI infrastructure, including network architecture, resource allocation, and coordination mechanisms needed to support large-scale machine learning workloads across multiple geographic locations. Gain insights into Meta's experience with building and managing AI infrastructure at scale, including best practices for maintaining performance, reliability, and efficiency when deploying AI systems across data center regions. Understand the implications of regional AI infrastructure scaling for latency optimization, data locality, fault tolerance, and operational complexity in modern cloud computing environments.
Syllabus
Scaling the AI Infrastructure to Data Center Regions
Taught by
Open Compute Project