Learn about the implementation of Software Pipelining in LLVM for the AArch64 architecture in this technical conference talk, which covers a loop optimization technique for increasing instruction-level parallelism. Discover how the MachinePipeliner pass gained basic support for AArch64 and delivered significant performance improvements on Neoverse V1 processors, particularly for loops with long dependency chains. Examine real performance results showing that Software Pipelining remains effective even on modern out-of-order processors. Explore the current state of the implementation, whose patches have been merged into the main branch, along with future improvement opportunities such as a user interface based on pragma directives. Gain detailed insight into the development process and the performance gains achieved with this optimization.