Manage Cloud Native LLM Workloads Across Edge and Cloud Seamlessly Using KubeEdge and WasmEdge
CNCF [Cloud Native Computing Foundation] via YouTube
Overview
This conference talk explores how to deploy Large Language Models (LLMs) beyond data centers to edge devices by integrating KubeEdge and WasmEdge. Learn how this combination addresses key challenges in edge AI deployment, including maintaining inference accuracy on resource-constrained devices and simplifying deployment across heterogeneous hardware. Discover how WasmEdge provides a lightweight, portable runtime under 50 MB with no external dependencies, while KubeEdge Sedna orchestrates edge-cloud collaboration by monitoring inference accuracy and automatically routing requests to cloud-based models when needed. See a demonstration of how small LLMs deliver fast local inference at the edge, with seamless escalation to larger cloud models when higher accuracy is required. The presenters showcase how inference workloads written in Rust and compiled to WebAssembly can be deployed across edge and cloud environments without modification. This solution has been deployed in production across multiple industries, including aerospace and banking.
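The edge-cloud collaboration described above follows a confidence-based fallback pattern: the small edge model answers first, and the request is escalated to the larger cloud model only when the local result is not confident enough. The following is a minimal sketch of that routing logic in Rust; the function names, the stub models, and the length-based confidence heuristic are all hypothetical illustrations, not the actual KubeEdge Sedna or WasmEdge APIs.

```rust
#[derive(Debug)]
struct InferenceResult {
    answer: String,
    confidence: f32, // model's self-reported confidence in [0, 1]
}

// Hypothetical stub for a small LLM served locally by WasmEdge at the edge.
fn edge_infer(prompt: &str) -> InferenceResult {
    // Toy heuristic: pretend short prompts are easy for the small model.
    let confidence = if prompt.len() < 20 { 0.95 } else { 0.40 };
    InferenceResult {
        answer: format!("edge answer to '{prompt}'"),
        confidence,
    }
}

// Hypothetical stub for a larger cloud-hosted model.
fn cloud_infer(prompt: &str) -> InferenceResult {
    InferenceResult {
        answer: format!("cloud answer to '{prompt}'"),
        confidence: 0.99,
    }
}

// Sedna-style joint inference: serve locally when the edge model is
// confident enough, otherwise escalate the request to the cloud.
fn infer_with_fallback(prompt: &str, threshold: f32) -> InferenceResult {
    let local = edge_infer(prompt);
    if local.confidence >= threshold {
        local
    } else {
        cloud_infer(prompt)
    }
}

fn main() {
    println!("{:?}", infer_with_fallback("2 + 2?", 0.8));
    println!("{:?}", infer_with_fallback("Summarize this 40-page report", 0.8));
}
```

In the real system the same Wasm binary runs unchanged at both tiers; only the model it is paired with differs, which is what makes the fallback transparent to the caller.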
Syllabus
Manage Cloud Native LLM Workloads Across Edge and Cloud Seamlessly Using KubeE... Vivian Hu & Fei Xu
Taught by
CNCF [Cloud Native Computing Foundation]