Learning in Artificial Neural Networks: A Statistical Perspective

1 December 1989

journal article
Published by MIT Press in Neural Computation

Vol. 1 (4), 425-464
https://doi.org/10.1162/neco.1989.1.4.425

Abstract

The premise of this article is that learning procedures used to train artificial neural networks are inherently statistical techniques. It follows that statistical theory can provide considerable insight into the properties, advantages, and disadvantages of different network learning methods. We review concepts and analytical results from the literatures of mathematical statistics, econometrics, systems identification, and optimization theory relevant to the analysis of learning in artificial neural networks. Because of the considerable variety of available learning procedures and necessary limitations of space, we cannot provide a comprehensive treatment. Our focus is primarily on learning procedures for feedforward networks. However, many of the concepts and issues arising in this framework are also quite broadly relevant to other network learning paradigms. In addition to providing useful insights, the material reviewed here suggests some potentially useful new training methods for artificial neural networks.
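
The premise that network learning procedures are statistical techniques can be illustrated with a toy case. The sketch below is an illustration only and is not taken from the article: it assumes the simplest possible feedforward network (a single linear unit with a bias), a made-up data-generating model, and hypothetical function names and step-size constants. It trains the unit by per-example gradient descent on squared error with a decreasing step size, in the spirit of stochastic approximation, and compares the learned weights with the ordinary least-squares estimate computed in closed form from the same data.

```python
import random


def make_data(n, seed=0):
    """Draw n pairs (x, y) from an assumed toy model: y = 2x - 1 + Gaussian noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        y = 2.0 * x - 1.0 + rng.gauss(0.0, 0.1)
        data.append((x, y))
    return data


def ols(data):
    """Closed-form least-squares slope and intercept for simple regression."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    sxx = sum((x - mean_x) ** 2 for x, _ in data)
    w = sxy / sxx
    return w, mean_y - w * mean_x


def online_delta_rule(data, a=3.0, t0=10.0):
    """One pass of per-example gradient descent on squared error.

    The step size a / (t + t0) shrinks over time; a and t0 are
    illustrative choices, not values from the article.
    """
    w, b = 0.0, 0.0
    for t, (x, y) in enumerate(data):
        eta = a / (t + t0)
        err = y - (w * x + b)  # prediction error of the linear unit
        w += eta * err * x     # gradient step for the weight
        b += eta * err         # gradient step for the bias
    return w, b


data = make_data(5000)
w_ols, b_ols = ols(data)
w_sgd, b_sgd = online_delta_rule(data)
print("least squares:", w_ols, b_ols)
print("online learning:", w_sgd, b_sgd)
```

Under these assumptions the two printed pairs should be close to each other: the learning rule behaves as a recursive estimator of the same quantity that least squares estimates in one step. Networks with hidden units are nonlinear in their weights, so this simple agreement is not guaranteed to carry over unchanged.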
