Computational capabilities of restricted two-layered perceptrons

Abstract
We study the extent to which fixing the second-layer weights reduces the capacity and generalization ability of a two-layer perceptron. Architectures with N inputs, K hidden units, and a single output are considered, with both overlapping and nonoverlapping receptive fields. From simulations we obtain one measure of the strength of a network, its critical capacity αc. Using the ansatz τmed ∝ (αc − α)^(−2) to describe the manner in which the median learning time diverges as αc is approached, we estimate αc in a way that does not depend on arbitrary impatience parameters. The CHIR learning algorithm is used in our simulations. For K=3 and overlapping receptive fields we show that the general machine is equivalent to the committee machine with the same architecture. For K=5 and the same connectivity the general machine is the union of four distinct networks with fixed second-layer weights, of which the committee machine is the one with the highest αc. Since the capacity of the union of a finite set of machines equals that of the strongest constituent, the capacity of the general machine with K=5 equals that of the committee machine. We were not able to prove this for general K, but believe that it does hold. We investigated the internal representations used by different machines, and found that high correlations between the hidden units and the output reduce the capacity. Finally, we studied the Boolean functions that can be realized by networks with fixed second-layer weights. We discovered that two different machines implement two completely distinct sets of Boolean functions.
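The extrapolation behind this estimate can be illustrated with a minimal sketch (not taken from the paper, and using made-up numbers purely for demonstration): if τmed ∝ (αc − α)^(−2), then τmed^(−1/2) is linear in α and vanishes at α = αc, so a straight-line fit to τmed^(−1/2) versus α, extrapolated to zero, yields αc without any impatience cutoff.

```python
# Illustrative sketch only: estimating alpha_c from the divergence ansatz
#   tau_med ∝ (alpha_c - alpha)^(-2).
# If the ansatz holds, tau_med^(-1/2) is linear in alpha and vanishes at
# alpha = alpha_c, so a linear fit extrapolated to zero gives the estimate.
import numpy as np

# Hypothetical data (not from the paper): loads alpha = P/(N*K) and the
# median learning time measured at each alpha.
alpha = np.array([1.0, 1.5, 2.0, 2.5, 3.0])
tau_med = np.array([40.0, 90.0, 250.0, 1000.0, 9000.0])

# Linearize: y = tau_med^(-1/2) is proportional to (alpha_c - alpha).
y = tau_med ** -0.5

# Fit y = slope * alpha + intercept; the zero crossing estimates alpha_c.
slope, intercept = np.polyfit(alpha, y, 1)
alpha_c = -intercept / slope
print(f"estimated critical capacity alpha_c ≈ {alpha_c:.2f}")
```

The same fit can be repeated for each architecture (committee machine, other fixed second-layer machines) to compare their critical capacities on an equal footing.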