Abstract
This paper describes a parallel array architecture for universal digital signal processing. In addition to multiply-accumulate (MA) operations, the architecture supports nonlinear operations such as reciprocal, square root, exponential, and sine/cosine, so that several advanced algorithms can be mapped onto the array. The paper focuses on two very different algorithms: the fast Fourier transform (FFT) and matrix LU decomposition. The architecture uses only two types of cells, the Universal Multiply-Subtract-Add (UMSA) cell and the Universal Nonlinear (UNL) cell. Both MA and nonlinear operations are performed in hardware, so operation times are on the order of the chip clock cycle time. Using only two cell types makes the architecture highly suitable for wafer-scale integration. Notably, the same resources on the wafer are used whether it is configured for the FFT algorithm or for the LU decomposition algorithm.
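
To illustrate why two cell types can serve both algorithms, the following is a minimal software sketch, not the paper's hardware design. The function names `umsa`, `unl_reciprocal`, and `unl_sincos` are hypothetical, and the assumption that a UMSA cell produces both a + b*c and a - b*c is ours; the abstract does not specify the cell interfaces. Under that assumption, an FFT butterfly and an LU elimination update reduce to the same primitive, with the nonlinear cell supplying twiddle factors (sine/cosine) and pivot reciprocals.

```python
import math


def umsa(a, b, c):
    """Hypothetical multiply-subtract-add cell: returns (a + b*c, a - b*c)."""
    p = b * c
    return a + p, a - p


def unl_reciprocal(x):
    """Hypothetical nonlinear cell operation: reciprocal."""
    return 1.0 / x


def unl_sincos(theta):
    """Hypothetical nonlinear cell operation: returns (sin, cos) of theta."""
    return math.sin(theta), math.cos(theta)


def fft(x):
    """Radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor exp(-2*pi*i*k/n) built from the sine/cosine operation.
        s, c = unl_sincos(-2.0 * math.pi * k / n)
        w = complex(c, s)
        # One butterfly uses both outputs of the multiply-subtract-add cell.
        out[k], out[k + n // 2] = umsa(even[k], w, odd[k])
    return out


def lu_decompose(a):
    """Doolittle LU decomposition without pivoting; returns (L, U)."""
    n = len(a)
    u = [row[:] for row in a]
    l = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k in range(n):
        # The pivot reciprocal comes from the nonlinear operation.
        inv_pivot = unl_reciprocal(u[k][k])
        for i in range(k + 1, n):
            l[i][k] = u[i][k] * inv_pivot
            for j in range(k, n):
                # The elimination update uses only the subtract output.
                _, u[i][j] = umsa(u[i][j], l[i][k], u[k][j])
    return l, u
```

For example, `fft([1, 0, 0, 0])` yields four values equal to 1, and `lu_decompose([[4.0, 3.0], [6.0, 3.0]])` yields factors whose product reproduces the input matrix. The sketch omits pivoting and any notion of array scheduling, both of which a real mapping would need to address.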
