Systolic super summation

Abstract
A principal limitation in accuracy for scientific computation performed with floating-point arithmetic is due to the computation of repeated sums, such as those that arise in inner products. A systolic super summer of cellular design is proposed for the high-throughput performance of repeated sums of floating-point numbers. The apparatus receives pipelined inputs of streams of summands from one or many sources. The floating-point summands are converted into a fixed-point form by a sieve-like pipelined cellular packet-switching device with signal combining. The emerging fixed-point numbers are then summed in a corresponding network of extremely long accumulators (i.e., super accumulators). At the cell level, the design uses a synchronous model of VLSI. The amount of time the apparatus needs to compute an entire sum depends on the values of summands; at this architectural level, the design is asynchronous. The throughput per unit area of hardware approaches that of a tree network, but without the long wire and signal propagation delay that are intrinsic to tree networks.

This publication has 6 references indexed in Scilit: