Introducing Control Flow into Vectorized Code
- 1 September 2007
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- No. 1089795X, p. 280-291
- https://doi.org/10.1109/pact.2007.4336219
Abstract
Single instruction multiple data (SIMD) functional units are ubiquitous in modern microprocessors. Effective use of these SIMD functional units is essential to achieving the highest possible performance. Automatic generation of SIMD instructions in the presence of control flow is challenging, however, not only because SIMD code is hard to generate in the presence of arbitrarily complex control flow, but also because SIMD code that executes the instructions on all control paths may be slower than the scalar original, which may bypass a large portion of the code. One promising technique introduced recently involves inserting branches-on-superword-condition-codes (BOSCCs) to bypass vector instructions. In this paper, we describe two techniques that improve on the previous approach. First, BOSCCs are generated in a nested fashion so that even BOSCCs themselves can be bypassed by other BOSCCs. Second, we generate all vec_any_* instructions to bypass even some predicate-defining instructions. We implemented these techniques in a vectorizing compiler. On 14 kernels, the compiler achieves distinct speedups, including 1.99X over the previous technique that generates single-level BOSCCs and vec_any_ne only.
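The nested-bypass idea in the abstract can be illustrated with a small scalar model. The sketch below is not the paper's compiler output: it emulates a superword as a Python list of `WIDTH` lanes, and the kernel, the helper names (`select`, `kernel_scalar`, `kernel_nested_boscc`), and the `executed` counter are all hypothetical, chosen only to show where an outer and an inner bypass branch would sit. The function `vec_any_ne` here is a plain-Python stand-in for a vec_any_ne-style test, not the hardware instruction.

```python
WIDTH = 4  # assumed superword width in lanes (illustrative only)


def vec_any_ne(v, w):
    """Stand-in for a vec_any_ne-style test: true if any lane pair differs."""
    return any(x != y for x, y in zip(v, w))


def select(mask, on_true, on_false):
    """Lane-wise select, as used by if-converted (predicated) vector code."""
    return [t if m else f for m, t, f in zip(mask, on_true, on_false)]


def kernel_scalar(a, b):
    """Scalar original: nested conditionals skip work element by element."""
    out = list(b)
    for i in range(len(a)):
        if a[i] != 0:
            out[i] += a[i]
            if a[i] > 10:
                out[i] *= 2
    return out


def kernel_nested_boscc(a, b):
    """Vectorized model with nested bypass branches.

    Assumes len(a) == len(b) and that the length is a multiple of WIDTH.
    Returns the result and the number of guarded vector blocks executed.
    """
    out = list(b)
    executed = 0
    zero = [0] * WIDTH
    for i in range(0, len(a), WIDTH):
        va, vb = a[i:i + WIDTH], out[i:i + WIDTH]
        # Outer bypass: tested before the predicate p1 is even computed,
        # so the predicate-defining work is skipped too when no lane is active.
        if vec_any_ne(va, zero):
            executed += 1
            p1 = [x != 0 for x in va]
            vb = select(p1, [y + x for x, y in zip(va, vb)], vb)
            p2 = [m and x > 10 for m, x in zip(p1, va)]
            # Inner bypass: only reachable when the outer branch was not taken.
            if any(p2):
                executed += 1
                vb = select(p2, [2 * y for y in vb], vb)
            out[i:i + WIDTH] = vb
    return out, executed
```

For an input such as `a = [0, 0, 0, 0, 1, 0, 0, 0, 0, 20, 0, 3]` with `b` all ones, the first superword is skipped entirely and the second skips its inner block, so only three of the six guarded blocks run while the result matches the scalar version. Whether such bypassing pays off on real hardware depends on branch cost and data distribution, which this model does not capture.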