Accelerating AI/ML Inference Workloads on Kubernetes
CNCF [Cloud Native Computing Foundation] via YouTube
Overview
Learn about the latest developments in AI/ML inference workloads on Kubernetes in this 32-minute conference talk from CNCF. Explore how the Kubernetes Working Group Serving (WG Serving) addresses the challenges posed by Generative AI through enhanced serving infrastructure. Discover the group's initiatives for optimizing compute-intensive inference scenarios that rely on specialized accelerators, workloads whose requirements differ from those of traditional web services and stateful databases. Gain detailed insights into WG Serving's workstreams and ongoing developments, and learn how model server authors and practitioners can leverage Kubernetes capabilities for serving workloads. Understand how to contribute to the advancement of AI/ML inference on Kubernetes and participate in this evolving technological landscape.
Syllabus
WG Serving: Accelerating AI/ML Inference Workloads on Kubernetes - E.A. Gutierrez, Y. Tang
Taught by
CNCF [Cloud Native Computing Foundation]