Optimizing LLM Efficiency One Trace at a Time on Kubernetes
CNCF [Cloud Native Computing Foundation] via YouTube
Overview
Learn how to optimize Large Language Model (LLM) deployments on Kubernetes in this 25-minute conference talk published by CNCF. The talk covers using OpenTelemetry's profiling capabilities to identify resource-intensive code segments, detect memory leaks, and prevent out-of-memory errors in LLM applications. It shows how dynamic runtime inspection can improve model performance, reduce latency, and help meet service level agreements, and how code-level analysis pinpoints performance bottlenecks and resource drains in LLM implementations. The goal is efficient Kubernetes deployments with better resource utilization and lower cost.
Syllabus
Optimizing LLM Efficiency One Trace at a Time on Kubernetes - Aditya Soni, Forrester & Seema Saharan
Taught by
CNCF [Cloud Native Computing Foundation]