Overview
In this Ray Summit 2024 conference talk, Huawei engineers Boyuan Chen, Chong Yin Tan, and Xiaoshuang Liu describe how they integrated 10,000 Ascend NPUs into a Ray cluster. The talk covers the technical challenges they encountered, and the solutions they developed, while migrating existing business cases to Ray and adding Huawei Ascend NPU support. It introduces their custom full-stack Ray observability engine for debugging and optimizing very large clusters, and explains how NPU and GPU tasks are scheduled within the same infrastructure. It also discusses strategies for maximizing resource utilization and maintaining stability in large-scale AI deployments, including the migration of a hyperscale inference pipeline to Ray. The talk is aimed at organizations and engineers interested in scaling distributed computing and AI infrastructure.
Syllabus
Scaling Ray to 10K NPUs: Huawei's Hyperscale Journey | Ray Summit 2024
Taught by
Anyscale