Perfect loss of generalization due to noise in K=2 parity machines
- 21 March 1994
- journal article
- Published by IOP Publishing in Journal of Physics A: Mathematical and General
- Vol. 27 (6), 1917-1927
- https://doi.org/10.1088/0305-4470/27/6/017
Abstract
Learning in a specific type of multilayer network, referred to as a K=2 parity machine, is studied in the limit where both the system size N and the number of examples m become infinite while the ratio alpha = m/N remains finite. The machine consists of K=2 hidden units with non-overlapping receptive fields, each of size N/2; for each input, the output is the sign of the product of the two hidden units. We investigate incremental learning empirically, using a least-action algorithm, in the following two learning paradigms. In the first, it is assumed that each example is transmitted perfectly to the student. We show that the ability to generalize emerges as the rescaled length l of the connection vector reaches a critical value l_c. Further, we show that the student can identify the target exactly in the limit alpha → infinity, where the prediction error epsilon decreases to zero as epsilon ≈ 0.441 alpha^(-1/3). In the second paradigm, we examine what happens if each teacher signal is reversed to the opposite sign at a noise rate lambda. For small lambda, the prediction error is found to converge to a finite value of O(sqrt(lambda)) in O(lambda^(-3/2)) iterations. However, for a noise rate beyond a critical value lambda_c ≈ 0.175, the student cannot acquire any generalization ability even as alpha → infinity.
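To make the setup concrete, here is a minimal sketch, not taken from the paper, of a K=2 parity machine with non-overlapping receptive fields, trained incrementally with a least-action-style update and optional label-flip noise. The dimension N, the noise rate lam, the alpha = 20 training budget, and the exact form of the update rule are illustrative assumptions; the paper's algorithm may differ in detail.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 100        # input dimension; each hidden unit sees a receptive field of size N/2 (assumed value)
half = N // 2
lam = 0.05     # label-flip noise rate lambda (assumed value)

# One weight vector per hidden unit, each acting on its own non-overlapping half of the input.
teacher = rng.standard_normal((2, half))
student = rng.standard_normal((2, half)) * 0.01   # small random start so the fields are nonzero

def parity_output(w, x):
    """K=2 parity machine: the output is the sign of the product of the two hidden fields."""
    fields = np.array([w[0] @ x[:half], w[1] @ x[half:]])
    return np.sign(fields.prod()), fields

def least_action_step(w, x, label):
    """On error, perceptron-update the hidden unit with the smallest |field|:
    the cheapest ('least action') way to flip the machine's output."""
    out, fields = parity_output(w, x)
    if out != label:
        k = int(np.argmin(np.abs(fields)))
        receptive = x[:half] if k == 0 else x[half:]
        # Unit k must take the sign that makes the product of the fields equal the label.
        desired = label * np.sign(fields[1 - k])
        w[k] += desired * receptive

alpha = 20                       # rescaled number of examples m/N (assumed value)
for _ in range(alpha * N):
    x = rng.choice([-1.0, 1.0], size=N)
    label, _ = parity_output(teacher, x)
    if rng.random() < lam:       # teacher signal reversed at noise rate lambda
        label = -label
    least_action_step(student, x, label)

# Monte Carlo estimate of the prediction error epsilon on fresh, noise-free examples.
test = rng.choice([-1.0, 1.0], size=(2000, N))
errors = sum(parity_output(student, x)[0] != parity_output(teacher, x)[0] for x in test)
print(f"estimated prediction error: {errors / len(test):.3f}")
```

The least-action choice updates only the hidden unit whose local field is smallest in magnitude, since flipping that unit's sign is the smallest change that corrects the machine's output.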
References
- Memorization Without Generalization in a Multilayered Neural Network. Europhysics Letters, 1992.
- Learning Curves for Error Minimum and Maximum Likelihood Algorithms. Neural Computation, 1992.
- Four Types of Learning Curves. Neural Computation, 1992.
- Statistical mechanics of learning from examples. Physical Review A, 1992.
- The Perceptron Algorithm is Fast for Nonmalicious Distributions. Neural Computation, 1990.
- Bounds on the learning capacity of some multi-layer networks. Biological Cybernetics, 1989.
- A new polynomial-time algorithm for linear programming. Combinatorica, 1984.