INC Manager for AI Fabrics - Orchestrating In-Network Collective Acceleration in AI Infrastructure
Open Compute Project via YouTube
Overview
Learn about In-Network Collective (INC) Manager technology in this 22-minute conference talk, which explores how to orchestrate and manage switch-based collective acceleration in AI fabric infrastructure. Discover how INC improves performance by offloading GPU collectives such as all_reduce and reduce_scatter to fabric hardware that can accelerate reduction and multicast functions. Explore the central role of the INC Manager as a system management component that communicates with the INC_agent in the SONiC network operating system (NOS) through the control and management planes. Examine the proposed architecture and the software interfaces on SONiC and SAI (Switch Abstraction Interface) required for INC deployment, including strategies for deploying, managing, and debugging INC-capable fabrics. Gain insight into building the ecosystem needed to support switch hardware with INC functionality, and understand the technical requirements for implementing collective acceleration in modern AI data center environments.
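For readers unfamiliar with the two collectives named above, the following is a minimal reference sketch of what all_reduce and reduce_scatter compute. It is not taken from the talk and models no real switch, SONiC, or SAI interface. The helper `switch_reduce` is a hypothetical stand-in for the reduction an INC-capable switch would perform, and element-wise sum is assumed as the reduction operator.

```python
from typing import List

Vector = List[float]


def switch_reduce(contributions: List[Vector]) -> Vector:
    """Element-wise sum; a hypothetical stand-in for an in-switch reduction."""
    return [sum(values) for values in zip(*contributions)]


def all_reduce(contributions: List[Vector]) -> List[Vector]:
    """Every rank receives the full reduced vector (reduce, then multicast)."""
    reduced = switch_reduce(contributions)
    return [list(reduced) for _ in contributions]


def reduce_scatter(contributions: List[Vector]) -> List[Vector]:
    """Rank i receives only the i-th equal-sized shard of the reduced vector."""
    reduced = switch_reduce(contributions)
    n = len(contributions)
    if len(reduced) % n != 0:
        raise ValueError("vector length must be divisible by the number of ranks")
    shard = len(reduced) // n
    return [reduced[i * shard:(i + 1) * shard] for i in range(n)]


# Two ranks, each contributing a four-element vector.
ranks = [[1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0]]
print(all_reduce(ranks))      # [[11.0, 22.0, 33.0, 44.0], [11.0, 22.0, 33.0, 44.0]]
print(reduce_scatter(ranks))  # [[11.0, 22.0], [33.0, 44.0]]
```

In this sketch the reduction happens once, in one place, and the result is then copied or sharded out to the ranks. That is the same reduce-and-multicast pattern the description attributes to INC-capable fabric hardware.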
Syllabus
INC Manager for AI Fabrics
Taught by
Open Compute Project