Overview
Learn how unstructured sparsity interacts with tensor core architectures in this 37-minute conference talk, which draws on practical lessons from sparse attention mechanisms and Mixture of Experts (MoE) models. Explore the challenges and opportunities of implementing sparse computational patterns on modern GPU tensor cores through case studies from attention and MoE architectures. Discover optimization strategies for managing parallelism in sparse workloads, understand how different sparsity patterns affect performance on tensor hardware, and weigh the trade-offs between computational efficiency and memory access patterns. Examine how unstructured sparsity can be used effectively in deep learning applications within the constraints and capabilities of tensor cores.
Syllabus
Unstructured Sparsity Meets Tensor Cores: Lessons from Sparse Attention and MoE
Taught by
Simons Institute