Abstract
We present a more efficient and implementable algorithm for solving triangular systems on a parallel (SIMD) computer. It requires O(log N) fewer processing cycles than the best previous results, where N is the system size. We also show that the data can be accessed and aligned within the same order of time, using as many memory units as processors, with Ω networks for data alignment. (Previous results on this type of algorithm have not treated the problem of data access and alignment in any detail.)
