Overview
Watch a technical presentation in which ByteDance and Broadcom architects explore how Scheduled Ethernet Fabric technology optimizes large-scale AI training clusters. The talk covers the architecture for efficiently connecting tens of thousands of GPUs, including extensive GPU scale-out, multi-tenancy for diverse parallel workloads, and networking that is resilient to failures. ByteDance shares its real-world benchmarking results and deployment experience with the fabric, and the speakers discuss why open ecosystems matter for continued innovation in AI infrastructure. The presentation also outlines the key requirements for high-performance network fabrics that maximize computational power across massive GPU clusters running varied AI workloads.
Syllabus
Scheduled Ethernet Fabric for Large scale AI training cluster
Taught by
Open Compute Project