Overview
Learn how unstructured sparsity intersects with tensor core architectures through practical insights from sparse attention mechanisms and Mixture of Experts (MoE) models in this 37-minute conference talk. Explore the challenges and opportunities that arise when implementing sparse computational patterns on modern GPU tensor cores, examining real-world case studies from attention mechanisms and MoE architectures. Discover optimization strategies for managing parallelism in sparse workloads, understand the performance implications of different sparsity patterns on tensor hardware, and gain insights into the trade-offs between computational efficiency and memory access patterns. Examine how unstructured sparsity can be effectively leveraged in deep learning applications while working within the constraints and capabilities of specialized tensor processing units.
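As a rough illustration of the hardware constraint discussed above (a sketch, not material from the talk itself): NVIDIA's sparse tensor cores accelerate a 2:4 semi-structured pattern, where every contiguous group of four weights contains at most two nonzeros, while unstructured pruning gives no such guarantee. The helper names below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# A small dense weight matrix (sizes chosen only for illustration).
W = rng.standard_normal((8, 8))

# Unstructured sparsity: zero out the globally smallest 50% of weights.
# Nonzeros can land anywhere, so no fixed hardware pattern is guaranteed.
threshold = np.quantile(np.abs(W), 0.5)
unstructured = np.where(np.abs(W) >= threshold, W, 0.0)

def prune_2_of_4(w):
    """In every contiguous group of 4 weights along a row, keep the 2 largest
    by magnitude -- the 2:4 pattern that sparse tensor cores can exploit."""
    groups = w.reshape(-1, 4)
    pruned = np.zeros_like(groups)
    idx = np.argsort(-np.abs(groups), axis=1)[:, :2]  # 2 largest per group
    np.put_along_axis(pruned, idx,
                      np.take_along_axis(groups, idx, axis=1), axis=1)
    return pruned.reshape(w.shape)

semi_structured = prune_2_of_4(W)

def satisfies_2_of_4(w):
    """Check the hardware constraint: <= 2 nonzeros per group of 4."""
    return bool(np.all(np.count_nonzero(w.reshape(-1, 4), axis=1) <= 2))

print(satisfies_2_of_4(semi_structured))  # True by construction
print(satisfies_2_of_4(unstructured))     # usually False at 50% global sparsity
```

Both matrices are 50% sparse, but only the 2:4 version maps directly onto sparse tensor core instructions; exploiting the unstructured one efficiently is exactly the challenge the talk examines.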
Syllabus
Unstructured Sparsity Meets Tensor Cores: Lessons from Sparse Attention and MoE
Taught by
Simons Institute