Exploiting superword level parallelism with multimedia instruction sets
- 1 May 2000
- journal article
- Published by Association for Computing Machinery (ACM) in ACM SIGPLAN Notices
- Vol. 35 (5), 145-156
- https://doi.org/10.1145/358438.349320
Abstract
Increasing focus on multimedia applications has prompted the addition of multimedia extensions to most existing general purpose microprocessors. This added functionality comes primarily with the addition of short SIMD instructions. Unfortunately, access to these instructions is limited to in-line assembly and library calls. Generally, it has been assumed that vector compilers provide the most promising means of exploiting multimedia instructions. Although vectorization technology is well understood, it is inherently complex and fragile. In addition, it is incapable of locating SIMD-style parallelism within a basic block. In this paper we introduce the concept of Superword Level Parallelism (SLP), a novel way of viewing parallelism in multimedia and scientific applications. We believe SLP is fundamentally different from the loop level parallelism exploited by traditional vector processing, and therefore demands a new method of extracting it. We have developed a simple and robust compiler for detecting SLP that targets basic blocks rather than loop nests. As with techniques designed to extract ILP, ours is able to exploit parallelism both across loop iterations and within basic blocks. The result is an algorithm that provides excellent performance in several application domains. In our experiments, dynamic instruction counts were reduced by 46%. Speedups ranged from 1.24 to 6.70.
This publication has 12 references indexed in Scilit:
- Pointer analysis for multithreaded programs. Association for Computing Machinery (ACM), 1999
- How multimedia workloads will change processor design. Computer, 1997
- Subword parallelism with MAX-2. IEEE Micro, 1996
- VIS speeds new media processing. IEEE Micro, 1996
- MMX technology extension to the Intel architecture. IEEE Micro, 1996
- MicroUnity's MediaProcessor architecture. IEEE Micro, 1996
- Evaluation of Fortran vector compilers and preprocessors. Software: Practice and Experience, 1991
- Compiling Fortran 8x array features for the connection machine computer system. Association for Computing Machinery (ACM), 1988
- Dependence graphs and compiler optimizations. Association for Computing Machinery (ACM), 1981
- The ILLIAC IV Computer. IEEE Transactions on Computers, 1968