Effortlessly Build High-Performance AI/ML Processing Pipelines Within the ML Lifecycles
CNCF [Cloud Native Computing Foundation] via YouTube
Overview
Learn to build high-performance AI/ML processing pipelines that chain accelerators together for optimal workload performance in this conference talk from KubeCon + CloudNativeCon. Discover how to move beyond traditional streaming processing, where accelerators are assigned to specific tasks, and instead create service-level processing infrastructure by assigning each task to a suitable accelerator and connecting the tasks in sequence. Explore the limitations of native Kubernetes for deploying processing pipelines that chain accelerators, and see how Numaflow combined with Dynamic Resource Allocation (DRA) provides a solution. Watch a practical demonstration of building a video inference system with this pipeline approach, and understand how such pipelines can be integrated into MLOps workflows through Kubeflow Pipelines. Gain insights into future innovations, including high-speed communication between accelerators via secondary network interface cards, in this 26-minute technical session on advancing cloud native AI/ML infrastructure, presented by Kazuki Yamamoto from NTT.
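To give a flavor of the approach the talk describes, the chained-accelerator idea might be sketched as a Numaflow Pipeline whose vertices run in sequence, each requesting its own accelerator through a DRA resource claim. This is a minimal, hypothetical sketch, not material from the talk: the vertex names, container images, and claim names are placeholders, and the exact way Numaflow exposes DRA claims in its container spec is an assumption.

```yaml
# Hypothetical sketch: a Numaflow Pipeline chaining a decode stage and an
# inference stage, each tied to a different accelerator via a DRA claim.
# Images and claim names are illustrative placeholders.
apiVersion: numaflow.numaproj.io/v1alpha1
kind: Pipeline
metadata:
  name: video-inference
spec:
  vertices:
    - name: in
      source:
        http: {}                            # ingest video frames over HTTP
    - name: decode
      udf:
        container:
          image: example.com/decode:latest  # placeholder decode image
          resources:
            claims:
              - name: vpu-claim             # assumed DRA claim for a decode accelerator
    - name: infer
      udf:
        container:
          image: example.com/infer:latest   # placeholder inference image
          resources:
            claims:
              - name: gpu-claim             # assumed DRA claim for an inference GPU
    - name: out
      sink:
        log: {}                             # log results for the demo
  edges:                                    # connect the stages in sequence
    - from: in
      to: decode
    - from: decode
      to: infer
    - from: infer
      to: out
```

The point of the shape above is that each vertex, not the pipeline as a whole, declares which accelerator it needs, which is what lets heterogeneous accelerators be chained as one service-level pipeline.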
Syllabus
Effortlessly Build High-Performance AI/ML Processing Pipelines Within the ML Lifecycles - Kazuki Yamamoto
Taught by
CNCF [Cloud Native Computing Foundation]