Accelerating AI/ML Inference Workloads on Kubernetes
CNCF [Cloud Native Computing Foundation] via YouTube
Overview
Learn about the latest developments in AI/ML inference workloads on Kubernetes in this 32-minute conference talk from CNCF. Explore how the Kubernetes Working Group Serving (WG Serving) addresses the challenges that Generative AI poses for serving infrastructure. Discover the group's initiatives for optimizing compute-intensive inference scenarios that run on specialized accelerators, with benefits that can extend to other serving workloads such as web services and stateful databases. Gain detailed insight into WG Serving's workstreams and ongoing developments, and learn how model server authors and practitioners can use Kubernetes capabilities for serving workloads. Find out how to contribute to the advancement of AI/ML inference on Kubernetes.
Syllabus
WG Serving: Accelerating AI/ML Inference Workloads on Kubernetes - E.A. Gutierrez, Y. Tang
Taught by
CNCF [Cloud Native Computing Foundation]