

Deep Dive - CRI-RM Based CPU and NUMA Affinity to Achieve AI Task Acceleration

CNCF [Cloud Native Computing Foundation] via YouTube

Overview

Explore a deep dive into CRI-RM based CPU and NUMA affinity for accelerating AI tasks in this conference talk. Learn how integrating CRI-RM components can enhance resource allocation within Kubernetes nodes, potentially improving AI task performance by over 50%. Discover the limitations of current CPU and NUMA features in Kubernetes and how CRI-RM addresses these issues. Gain insights into CPU-based AI task acceleration schemes, topology-aware resource alignment, and the advantages of using CRI-RM for customized development in both newer and older Kubernetes versions. Examine test cases using ResNet50 and CNN models to understand the practical applications and benefits of this approach in AI training clusters.
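The overview above centers on exclusive CPU allocation and NUMA-aligned placement for AI workloads in Kubernetes. As a rough illustration (not taken from the talk itself), a Guaranteed-QoS pod spec with integer CPU requests equal to limits is the precondition under which Kubernetes' static CPU manager — and, by extension, a topology-aware policy such as CRI-RM's — can pin a container to dedicated, NUMA-aligned cores. The pod name and image below are hypothetical placeholders:

```yaml
# Illustrative sketch only: Guaranteed QoS (requests == limits, integer CPU
# count) is what makes a container eligible for exclusive CPU allocation.
apiVersion: v1
kind: Pod
metadata:
  name: ai-training-task            # hypothetical name
spec:
  containers:
  - name: trainer
    image: registry.example.com/resnet50-trainer:latest   # hypothetical image
    resources:
      requests:
        cpu: "16"                   # integer CPUs, equal to the limit
        memory: 32Gi
      limits:
        cpu: "16"
        memory: 32Gi
```

With a spec like this, the CPU manager (or CRI-RM's topology-aware policy) can assign the container 16 dedicated cores from a single NUMA node where capacity allows, avoiding the noisy-neighbor and cross-socket memory-access penalties the talk discusses.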

Syllabus

Intro
Content
CRI-RM architecture
Noisy neighbors
Latency-critical workloads
CPU clock speed throttling
Available resources and control
Topology aware policy
Static-pools policy
CRI-RM node agent
CRI-RM webhook
Topology-aware resource alignment
Some problems in AI training clusters (Kubernetes + Docker)
Run AI training tasks on the CPU
CPU management in Kubernetes
Kubernetes-integrated CRI-RM
Test environment
Test case 1: ResNet50 + ImageNet
Test case 2: CNN + MNIST
Conclusion

Taught by

CNCF [Cloud Native Computing Foundation]

