Characterizing Communication Patterns in Distributed LLM Inference
HOTI - Hot Interconnects Symposium via YouTube
Overview
Explore the communication patterns and performance characteristics of distributed large language model inference in this 33-minute conference talk from the Hot Interconnects Symposium. Examine how LLM inference workloads are distributed across multiple nodes, and analyze the resulting network communication requirements, bottlenecks, and optimization opportunities. Learn about research findings on communication overhead, data movement patterns, and interconnect utilization when large language models run across distributed computing environments. Discover how different distributed inference strategies affect network performance, and understand the implications for designing efficient high-performance computing systems for LLM deployment at scale.
Syllabus
Characterizing Communication Patterns in Distributed LLM Inference - Lang Xu
Taught by
HOTI - Hot Interconnects Symposium