Overview
Discover how to build a self-service AI/ML infrastructure on Kubernetes that provides dynamic provisioning, policy enforcement, and full observability, so that data scientists can focus on model development. Learn to extend ClusterAPI with Backstage, OpenFeature, k0s, Prometheus, Sveltos, and k0rdent to create a scalable, secure, and automated AI/ML platform. Explore AI/ML cluster provisioning with ClusterAPI and k0s, implement policy-driven multi-cluster automation, and control model rollouts through feature flags. Understand how to offer self-service ML environments through an Internal Developer Platform and how to implement observability and performance monitoring at scale. Gain practical insights into a Kubernetes-native approach that lets platform engineering teams manage the full lifecycle of AI/ML workloads.
Syllabus
An AI/ML-Driven Approach to Platform Lifecycle Management - DevConf.US 2025
Taught by
DevConf