
XuetangX

LLM Inference Optimization and Deployment in Practice

via XuetangX

Overview

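Study the core techniques of inference optimization and deployment for large language models (LLMs) in depth, and build a complete body of knowledge spanning model quantization, KV cache optimization, batching strategies, and high-performance inference frameworks. Explore the principles and practice of mainstream inference frameworks such as vLLM, TensorRT-LLM, and DeepSpeed, and learn how advanced techniques such as PagedAttention, continuous batching, and speculative decoding can substantially raise inference throughput and reduce latency. Working from real business scenarios, learn how to carry out performance tuning, resource scheduling, and production deployment across different hardware environments, so that engineers and researchers can put large models to work efficiently in real applications.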
