Intelligent Topology for AI Power - Network-Aware Scheduling Optimization With Volcano HyperNode
CNCF [Cloud Native Computing Foundation] via YouTube
Overview
Explore network topology abstraction and scheduling optimization for AI workloads in this 26-minute conference talk from CNCF. Discover how the growing complexity of AI models drives demand for more sophisticated orchestration in high-performance AI clusters, with a particular focus on disaggregated prefill-decode (PD) deployment patterns. Learn how the CNCF community is exploring network topology awareness and scheduling optimization in the Volcano and Kueue projects. Examine three challenges in building AI infrastructure: creating a common topology abstraction across diverse hardware platforms, managing topology data efficiently so the scheduler can make good placement decisions, and providing ecosystem support for typical workload abstractions such as LeaderWorkerSet (LWS). Gain insight into how network-aware scheduling can improve AI cluster performance, and follow the latest developments in cloud native computing for AI workloads through practical examples and community-driven solutions.
Syllabus
Intelligent Topology for AI Power: Network-Aware Scheduling Optimization With Volcano... Kevin Wang
Taught by
CNCF [Cloud Native Computing Foundation]