A Practical Guide to Benchmarking AI and GPU Workloads in Kubernetes
CNCF [Cloud Native Computing Foundation] via YouTube
Overview
This conference talk provides a practical guide to benchmarking AI and GPU workloads in Kubernetes environments. Learn how to improve GPU resource efficiency and AI workload performance through effective benchmarking. The talk covers how to set up, configure, and run a range of GPU and AI benchmarks in Kubernetes across use cases including model serving, model training, and GPU stress testing. It introduces tools such as NVIDIA Triton Inference Server, fmperf for benchmarking LLM serving performance, MLPerf for comparing machine learning systems, and utilities like GPUStressTest, gpu-burn, and CUDA benchmarks. Step-by-step demonstrations cover GPU monitoring and load-generation tools, building practical skills for running benchmarks on GPUs in Kubernetes and for using existing tools to fine-tune GPU resource and workload management for better performance and efficiency.
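To illustrate the kind of setup the talk walks through, a GPU stress test such as gpu-burn can be run as a one-off Kubernetes Pod that requests a GPU via the NVIDIA device plugin. The manifest below is a minimal sketch, not the talk's exact configuration; the container image name is a placeholder, and the burn duration is an arbitrary example:

```yaml
# Minimal sketch: run gpu-burn as a one-off benchmark Pod on a GPU node.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-burn-benchmark
spec:
  restartPolicy: Never
  containers:
  - name: gpu-burn
    # Placeholder image; substitute an image that has gpu-burn built in.
    image: example.com/gpu-burn:latest
    args: ["60"]          # run the stress test for 60 seconds
    resources:
      limits:
        nvidia.com/gpu: 1 # request one GPU from the NVIDIA device plugin
```

After `kubectl apply -f` on a cluster with GPU nodes, the Pod's logs (`kubectl logs gpu-burn-benchmark`) report the measured throughput, which can be compared across node types or driver versions.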
Syllabus
A Practical Guide To Benchmarking AI and GPU Workloads in Kubernetes - Yuan Chen & Chen Wang
Taught by
CNCF [Cloud Native Computing Foundation]