How initial conditions affect generalization performance in large networks

1 March 1997

journal article
Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Neural Networks

Vol. 8 (2), 448-451
https://doi.org/10.1109/72.557701

Abstract

Generalization is one of the most important problems in neural-network research. It is influenced by several factors in the network design, such as network size, weight decay factor, and others. We show here that the initial weight distribution (for gradient descent training algorithms) is another factor that influences generalization. The initial conditions guide the training algorithm to search particular regions of the weight space. For instance, small initial weights tend to result in low-complexity networks, and therefore can effectively act as a regularization factor. We propose a novel network complexity measure, which is helpful in shedding light on the phenomenon, as well as in studying other aspects of generalization.
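
The claim that initial conditions steer gradient descent toward particular regions of weight space can be illustrated with a minimal sketch. It makes strong simplifying assumptions: it uses an overparameterized linear model as a stand-in for a large network and the weight norm as a crude proxy for complexity. It does not implement the complexity measure proposed in the article, and the function name `train_linear`, the data, and the step-size choice are all hypothetical.

```python
import numpy as np

def train_linear(X, y, w0, steps=5000):
    """Plain gradient descent on mean squared error, starting from w0."""
    n = X.shape[0]
    # Assumed step size: 1/L, where L is the largest curvature of the loss.
    lr = n / np.linalg.norm(X, 2) ** 2
    w = w0.copy()
    for _ in range(steps):
        w -= lr * X.T @ (X @ w - y) / n
    return w

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 20))     # 5 examples, 20 weights: many zero-error solutions
y = rng.normal(size=5)
direction = rng.normal(size=20)  # same initial direction, two different scales

results = {}
for scale in (0.01, 5.0):
    w = train_linear(X, y, scale * direction)
    # Store (training residual, norm of the final weights).
    results[scale] = (np.linalg.norm(X @ w - y), np.linalg.norm(w))
    print(f"init scale {scale}: residual {results[scale][0]:.2e}, "
          f"weight norm {results[scale][1]:.3f}")
```

In this linear setting both runs fit the training data, but gradient descent never changes the part of the initial weights that the data cannot see, so the run started from small weights ends at a smaller-norm solution. This is only an analogy for the regularizing effect described in the abstract, not a reproduction of its experiments.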
