Architecture and VLSI design of a VLSI neural signal processor

Abstract
A chip based on a scalable, parallel, systolic VLSI architecture has been designed to execute the compute-bound algorithmic primitives used by search and learning algorithms in neural networks and in low-level signal processing. The signal processor executes the algorithmic primitives shared by all neural nets. The throughput is 800 million connections/s (1 connection = 16 bits) at 50 MHz. The chip contains 610 K transistors on 187 mm² in a 1.0-µm CMOS technology. The I/O bandwidth for the weights is 3.2 Gbit/s, and the total data bandwidth is 10.9 Gbit/s. The processor is also useful for low-level signal preprocessing.
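The quoted figures can be turned into per-clock-cycle quantities with simple arithmetic. The sketch below is only a back-of-the-envelope check: the input numbers come from the abstract, while the derived values and all variable names are illustrative and are not stated in the source.

```python
# Figures quoted in the abstract.
CLOCK_HZ = 50_000_000                   # 50 MHz
CONNECTIONS_PER_S = 800_000_000         # 800 million connections/s
BITS_PER_CONNECTION = 16                # 1 connection = 16 bits
WEIGHT_BW_BITS_PER_S = 3_200_000_000    # 3.2 Gbit/s weight I/O
TOTAL_BW_BITS_PER_S = 10_900_000_000    # 10.9 Gbit/s total data bandwidth


def per_cycle(rate_per_s, clock_hz=CLOCK_HZ):
    """Convert a per-second rate into a per-clock-cycle quantity."""
    return rate_per_s / clock_hz


# Derived values (assumptions of this sketch, not figures from the abstract).
connections_per_cycle = per_cycle(CONNECTIONS_PER_S)
weight_bits_per_cycle = per_cycle(WEIGHT_BW_BITS_PER_S)
weights_per_cycle = weight_bits_per_cycle / BITS_PER_CONNECTION
total_bits_per_cycle = per_cycle(TOTAL_BW_BITS_PER_S)

print(f"connections per cycle:  {connections_per_cycle:.0f}")
print(f"weight bits per cycle:  {weight_bits_per_cycle:.0f}")
print(f"16-bit weights per cycle: {weights_per_cycle:.0f}")
print(f"total I/O bits per cycle: {total_bits_per_cycle:.0f}")
```

Under these numbers the chip completes 16 connections per clock while receiving four 16-bit weights per clock. The abstract does not explain how the two rates are reconciled, so any reading of that ratio (for example, on-chip reuse of weights) is a guess rather than a documented property.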
