Overview
Learn how unstructured sparsity intersects with tensor core architectures in this 37-minute conference talk, drawing on lessons from sparse attention mechanisms and Mixture of Experts (MoE) models. Explore the challenges and opportunities of implementing sparse computational patterns on modern GPU tensor cores, discover optimization strategies for managing parallelism in sparse workloads, and understand the performance implications of different sparsity patterns on tensor hardware, including the trade-offs between computational efficiency and memory access patterns. Examine how unstructured sparsity can be effectively leveraged in deep learning applications while working within the constraints and capabilities of specialized tensor processing units.
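The talk's own code is not reproduced here, but the central tension it names can be illustrated with a minimal NumPy sketch: unstructured magnitude pruning places nonzeros anywhere, while sparse tensor cores (NVIDIA Ampere and later) accelerate only the 2:4 semi-structured pattern, where at most 2 of every 4 consecutive weights are nonzero. The helper names below are illustrative, not from the talk.

```python
import numpy as np

def unstructured_prune(w: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude weights globally (unstructured sparsity).
    Flexible, but the surviving nonzeros land anywhere, so dense tensor cores
    cannot exploit the zeros directly."""
    k = int(w.size * sparsity)
    if k == 0:
        return w.copy()
    threshold = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    return np.where(np.abs(w) > threshold, w, 0.0)

def prune_2_to_4(w: np.ndarray) -> np.ndarray:
    """Keep the 2 largest-magnitude weights in each contiguous group of 4
    along the last axis: the 2:4 semi-structured pattern that sparse tensor
    cores can consume in hardware."""
    assert w.shape[-1] % 4 == 0, "last dim must be a multiple of 4"
    groups = w.reshape(-1, 4)
    # Indices of the two smallest-magnitude entries in each group of four.
    drop = np.argsort(np.abs(groups), axis=1)[:, :2]
    pruned = groups.copy()
    np.put_along_axis(pruned, drop, 0.0, axis=1)
    return pruned.reshape(w.shape)

rng = np.random.default_rng(0)
w = rng.standard_normal((8, 16)).astype(np.float32)

w_unstructured = unstructured_prune(w, sparsity=0.5)
w_24 = prune_2_to_4(w)

# Both results are 50% sparse, but only the 2:4 version has the regular
# layout that sparse tensor-core units accelerate.
print("unstructured nonzeros:", np.count_nonzero(w_unstructured))
print("2:4 nonzeros:         ", np.count_nonzero(w_24))
```

Both pruned matrices end up equally sparse, yet only the structured one maps onto the hardware; bridging that gap for genuinely unstructured patterns, as they arise in sparse attention and MoE workloads, is the problem the talk addresses.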
Syllabus
Unstructured Sparsity Meets Tensor Cores: Lessons from Sparse Attention and MoE
Taught by
Simons Institute