AssyLLM: Efficient Federated Fine-tuning of LLMs via Assembling Pre-trained Blocks
Overview
Learn about AssyLLM, a federated learning framework that enables memory-efficient fine-tuning of large language models on edge devices while preserving data privacy. This 21-minute conference presentation from USENIX ATC '25 introduces an approach that decomposes a pre-trained LLM into discrete transformer blocks, which are then iteratively selected and assembled according to the local data distributed across devices, producing models tailored to downstream tasks.

Explore the four core components of AssyLLM:
- Block Comparator, which assesses the compatibility of candidate blocks
- Elastic Adapter, which creates customized configurations to bridge structural differences between blocks
- Block Quanter, which adjusts weight precision to reduce memory overhead
- Block Swapper, which optimizes the swapping pipeline using block correlation metrics

Understand how the framework avoids traditional backpropagation to achieve high fine-tuning efficiency, delivering up to an 18.26% accuracy improvement, a 30.04x speedup, and a 92% reduction in memory consumption compared to conventional methods, as demonstrated through evaluation on multiple benchmark datasets of varying complexity.
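To make the assembly idea more concrete, here is a minimal Python sketch of the block-selection and quantization steps described above. It is not the authors' implementation: the `Block` class, `score_block`, `quantize_int8`, and `assemble` are hypothetical placeholders that loosely mirror the Block Comparator and Block Quanter roles, and the Elastic Adapter and Block Swapper are omitted entirely.

```python
import numpy as np

class Block:
    """Toy stand-in for one pre-trained transformer block (hypothetical)."""
    def __init__(self, name, weights):
        self.name = name
        self.weights = weights  # float32 weight matrix

def score_block(block, local_activations):
    # Hypothetical "Block Comparator": rate how well a block's outputs
    # match the local data distribution (higher is better).
    out = local_activations @ block.weights
    return -abs(float(out.mean()))  # toy compatibility score

def quantize_int8(weights):
    # Hypothetical "Block Quanter": symmetric int8 quantization,
    # cutting memory roughly 4x versus float32.
    scale = np.abs(weights).max() / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def assemble(candidate_blocks, local_activations, depth):
    # Greedily pick the most compatible blocks and quantize them;
    # no backpropagation is involved at any point.
    ranked = sorted(candidate_blocks,
                    key=lambda b: score_block(b, local_activations),
                    reverse=True)
    return [(b.name, *quantize_int8(b.weights)) for b in ranked[:depth]]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pool = [Block(f"block_{i}",
                  rng.standard_normal((64, 64)).astype(np.float32))
            for i in range(8)]
    local = rng.standard_normal((16, 64)).astype(np.float32)
    model = assemble(pool, local, depth=4)
    print([name for name, _, _ in model])
```

In the talk, block selection is driven by the locally held data on each device, which is why the sketch scores blocks against local activations rather than training them; the real system adds adapters between mismatched blocks and pipelines block swapping, which this toy example does not attempt.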
Syllabus
USENIX ATC '25 - AssyLLM: Efficient Federated Fine-tuning of LLMs via Assembling Pre-trained Blocks
Taught by
USENIX