Overview
Explore the current capabilities and limitations of automatic vectorization in modern C++ compilers in this 42-minute conference talk from Meeting C++ 2025. Discover how the performance landscape of modern CPUs has shifted over the past decade: vector (SIMD) units now vastly outperform the scalar execution units within the same core, making technologies like SVE2 and AVX-512 critical for optimal performance. Learn why failing to use these vector capabilities can cost an entire order of magnitude in performance, and understand the challenge C++ faces in having no built-in primitives for direct SIMD operations. Examine the two primary approaches available to developers: using target-specific libraries and extensions, or relying on compiler auto-vectorization for performance gains without code modifications. Analyze the enhanced auto-vectorization capabilities in the latest GCC and Clang releases, which are now enabled by default at higher optimization levels. Compare the effectiveness of auto-vectorizers across different code patterns, identifying where they excel and where they still fall short. Evaluate the size and performance of compiler-generated vectorized code against hand-optimized implementations using intrinsics, with theoretical maximum hardware performance serving as the ultimate benchmark.
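As a rough illustration of the kind of code patterns such a comparison involves, the sketch below contrasts two loops. It is not taken from the talk; the function names `saxpy` and `prefix_sum` are chosen here purely for illustration. The first loop has independent iterations and is the sort of pattern auto-vectorizers commonly handle; the second carries a dependency from one iteration to the next, which generally makes a direct SIMD translation harder.

```cpp
#include <cstddef>
#include <vector>

// Illustrative only: independent iterations. Each y[i] depends solely on
// x[i] and the old y[i], so several elements can be processed per instruction.
void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    const std::size_t n = x.size() < y.size() ? x.size() : y.size();
    for (std::size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}

// Illustrative only: loop-carried dependency. Each element needs the result
// of the previous iteration, which is a harder case for a vectorizer.
void prefix_sum(std::vector<float>& v) {
    for (std::size_t i = 1; i < v.size(); ++i) {
        v[i] += v[i - 1];
    }
}
```

Whether a given compiler actually vectorizes either loop depends on its version, the target, and the flags used. One way to check is to ask the compiler for a vectorization report, for example with `-fopt-info-vec` on GCC or `-Rpass=loop-vectorize` on Clang.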
Syllabus
Speed for Free - current state of auto vectorizing compilers - Stefan Fuhrmann - Meeting C++ 2025
Taught by
Meeting Cpp