
Memory-Efficient LLM Inference on Edge Devices With NNTrainer

Linux Foundation via YouTube

Overview

Explore how the NNTrainer open-source project achieves memory-efficient Large Language Model (LLM) inference on edge devices in this 26-minute conference talk. Discover how NNTrainer, originally optimized for training neural networks on memory-constrained devices, repurposes its proven memory schedulers and memory-storage cooperation infrastructure to run larger LLMs on smaller devices. Learn about NNTrainer's key technologies for minimizing memory footprint: memory-efficient tensor scheduling that manages when tensors occupy RAM, and an approach that uses flash storage as auxiliary memory to extend capacity for larger models. See practical examples of running LLMs on devices with NNTrainer's optimization techniques. Understand how this Linux Foundation AI & Data project, currently part of the NNStreamer organization and under review to become an independent LFAI project, helps make advanced AI models more accessible on resource-limited hardware through its memory management solutions.
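The flash-as-auxiliary-memory idea mentioned above can be pictured with a small standalone sketch. The snippet below is illustrative only and does not use the NNTrainer API: it memory-maps a hypothetical weight file, prefetches one layer's worth of weights from flash just before it is needed, and releases those pages afterwards so only the active layer stays resident in RAM. The file name, per-layer size, and prefetch/release pattern are assumptions made for the example.

    // Illustrative sketch (not the NNTrainer API): stream model weights from
    // flash so that only the currently active layer occupies RAM.
    #include <fcntl.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>
    #include <cstdint>
    #include <cstdio>

    int main() {
        const char *path = "model_weights.bin";      // hypothetical weight file
        const size_t layer_bytes = 4 * 1024 * 1024;  // assumed per-layer size

        int fd = open(path, O_RDONLY);
        if (fd < 0) { perror("open"); return 1; }

        struct stat st;
        if (fstat(fd, &st) < 0) { perror("fstat"); close(fd); return 1; }

        // Map the whole file; pages are read from flash only when accessed.
        void *map = mmap(nullptr, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (map == MAP_FAILED) { perror("mmap"); close(fd); return 1; }
        uint8_t *weights = static_cast<uint8_t *>(map);

        for (size_t off = 0; off + layer_bytes <= static_cast<size_t>(st.st_size);
             off += layer_bytes) {
            // Hint that this layer's pages will be needed soon (prefetch from flash).
            madvise(weights + off, layer_bytes, MADV_WILLNEED);

            // ... run inference with weights[off .. off + layer_bytes) here ...
            std::printf("processed layer at offset %zu\n", off);

            // Let the kernel drop these pages; they can be re-read from flash later.
            madvise(weights + off, layer_bytes, MADV_DONTNEED);
        }

        munmap(map, st.st_size);
        close(fd);
        return 0;
    }

NNTrainer's actual scheduler decides which tensors to keep, prefetch, or evict based on the execution order of the model; this sketch only shows the underlying mechanism of treating flash-backed memory as an extension of RAM.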

Syllabus

Memory-Efficient LLM Inference on Edge Devices With NNTrainer - Eunju Yang & Donghak Park

Taught by

Linux Foundation

