The Fast and the Curious - Building Scalable AI Platforms with Kubernetes and OpenStack

Learn to build scalable AI platforms on Kubernetes and OpenStack in this conference talk, which addresses the practical challenges of running AI workloads at scale on private cloud infrastructure. Drawing on real-world implementation experience, discover how to provision GPU-enabled clusters, manage model serving, and monitor distributed AI workloads. Explore the integration of open source tooling, including k0rdent and Kubernetes-native components such as KServe, Knative, Prometheus, and Grafana, to automate the MLOps lifecycle on OpenStack. Gain insight into setting up scalable, automated, and cost-efficient infrastructure for demanding AI/ML pipelines, moving beyond the limitations of public cloud to use private cloud capabilities. Understand the practical aspects of building infrastructure for AI applications where organizations keep full control over their technology stack, with a specific focus on GPU infrastructure management and distributed workload optimization.