Abstract
Many traditional algorithms for computing the fast Fourier transform (FFT) on conventional computers are unacceptable for advanced vector and parallel computers because they involve nonunit, power-of-two memory strides. This paper presents a practical technique for computing the FFT that avoids all such strides and ap pears to be near-optimal for a variety of current vector and parallel computers. Performance results of a pro gram based on this technique are presented. Notable among these results is that a Fortran implementation of this algorithm on the CRAY-2 runs up to 77% faster than Cray's assembly-coded library routine.

This publication has 5 references indexed in Scilit: