CLONE: Customizing LLMs for Efficient Latency-Aware Inference at the Edge

USENIX via YouTube

USENIX ATC '25 - CLONE: Customizing LLMs for Efficient Latency-Aware Inference at the Edge

