Accelerating AI/ML Inference Workloads on Kubernetes
CNCF [Cloud Native Computing Foundation] via YouTube
Overview
Learn about the latest developments in AI/ML inference workloads on Kubernetes in this 32-minute conference talk from CNCF. Explore how the Kubernetes Working Group Serving (WG Serving) addresses the challenges posed by generative AI through enhanced serving infrastructure. Discover the group's initiatives to optimize compute-intensive inference workloads that rely on specialized accelerators and whose requirements differ from those of traditional web services and stateful databases. Gain detailed insight into WG Serving's workstreams and ongoing developments, and learn how model server authors and practitioners can leverage Kubernetes capabilities for serving workloads. Understand how to contribute to the advancement of AI/ML inference on Kubernetes and participate in this evolving technological landscape.
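To make the accelerator-serving scenario concrete, the fragment below is a minimal sketch of how an inference workload might request a GPU on Kubernetes today. The image name and resource values are placeholders, not anything from the talk; `nvidia.com/gpu` is the standard extended-resource name exposed by the NVIDIA device plugin.

```yaml
# Illustrative sketch only: image, labels, and resource values are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: model-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: model-server
  template:
    metadata:
      labels:
        app: model-server
    spec:
      containers:
      - name: inference
        image: example.com/model-server:latest  # hypothetical model server image
        resources:
          limits:
            nvidia.com/gpu: 1  # one GPU, scheduled via the device plugin's extended resource
```

WG Serving's workstreams explore how serving-specific needs (startup time, accelerator sharing, autoscaling on inference metrics) go beyond what a plain Deployment like this expresses.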
Syllabus
WG Serving: Accelerating AI/ML Inference Workloads on Kubernetes - E.A. Gutierrez, Y. Tang
Taught by
CNCF [Cloud Native Computing Foundation]