Characterizing Communication Patterns in Distributed LLM Inference
HOTI - Hot Interconnects Symposium via YouTube
Overview
Explore the communication patterns and performance characteristics of distributed large language model (LLM) inference in this 33-minute conference talk from the Hot Interconnects Symposium. Examine how LLM inference workloads are distributed across multiple nodes, and analyze the network communication requirements, bottlenecks, and optimization opportunities that arise in distributed inference scenarios. Learn about research findings on communication overhead, data-movement patterns, and interconnect utilization when running large language models across distributed computing environments. Discover how different distributed inference strategies affect network performance, and understand the implications for designing efficient high-performance computing systems for LLM deployment at scale.
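To make the kind of communication the talk studies concrete, the sketch below (not taken from the talk) times the all-reduce step that typically dominates tensor-parallel LLM inference, where each rank holds a partial activation that must be summed across ranks. The rank count, tensor size, iteration count, and the choice of the gloo backend are illustrative assumptions chosen so the script runs on a single CPU-only machine.

import os
import time

import torch
import torch.distributed as dist
import torch.multiprocessing as mp


def worker(rank: int, world_size: int, hidden: int = 4096, iters: int = 20) -> None:
    # Minimal single-machine rendezvous; real deployments would use a launcher.
    os.environ["MASTER_ADDR"] = "127.0.0.1"
    os.environ["MASTER_PORT"] = "29500"
    dist.init_process_group("gloo", rank=rank, world_size=world_size)

    # Stand-in for the partial matmul result held by this tensor-parallel rank.
    partial_activation = torch.randn(1, hidden)

    # Time the all-reduce that merges partial results across ranks.
    start = time.perf_counter()
    for _ in range(iters):
        dist.all_reduce(partial_activation, op=dist.ReduceOp.SUM)
    dist.barrier()
    elapsed = (time.perf_counter() - start) / iters

    if rank == 0:
        bytes_moved = partial_activation.numel() * partial_activation.element_size()
        print(f"avg all-reduce latency: {elapsed * 1e6:.1f} us "
              f"for {bytes_moved / 1024:.1f} KiB per rank")
    dist.destroy_process_group()


if __name__ == "__main__":
    world_size = 2  # assumed number of tensor-parallel ranks
    mp.spawn(worker, args=(world_size,), nprocs=world_size, join=True)

Scaling the hidden size or rank count in this sketch gives a rough feel for how per-token communication volume grows, which is the kind of relationship the talk characterizes on real interconnects.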
Syllabus
Characterizing Communication Patterns in Distributed LLM Inference - Lang Xu
Taught by
HOTI - Hot Interconnects Symposium