Permutations on Illiac IV-Type Networks

Abstract
Performing permutations of data on SIMD computers efficiently is important for high-speed execution of parallel algorithms. In this correspondence we consider realizing permutations such as perfect shuffle, matrix transpose, bit-reversal, the class of bit-permute- complement (BPC), the class of Omega, and inverse Omega permutations on N = 2n processors with Illiac IV-type interconnection network, where each processor is connected to processors at distances of ± 1 and ± N. The minimum number of data transfer operations required for realizing any of these permutations on such a network is shown to be 2(N − 1). We provide a general three-phase strategy for realizing permutations and derive routing algorithms for performing perfect shuffle, Omega, Inverse Omega, bit reversal, and matrix-transpose permutations in 2(N − 1) steps. Our approach is quite simple, and unlike previous approaches, makes efficient use of the topology of the Illiac IV-type network to realize these permutations using the optimum number of data transfers. Our strategy is quite powerful: any permutation can be realized using this strategy in 3(N − 1) steps.