Topology and Geometry of Single Hidden Layer Network, Least Squares Weight Solutions
- 1 July 1995
- journal article
- Published by MIT Press in Neural Computation
- Vol. 7 (4), 672-705
- https://doi.org/10.1162/neco.1995.7.4.672
Abstract
In this paper the topological and geometric properties of the weight solutions for multilayer perceptron (MLP) networks under the MSE error criterion are characterized. The characterization is obtained by analyzing a homotopy from linear to nonlinear networks in which the hidden node function is slowly transformed from a linear to the final sigmoidal nonlinearity. Two different geometric perspectives for this optimization process are developed. The generic topology of the nonlinear MLP weight solutions is described and related to the geometric interpretations, error surfaces, and homotopy paths, both analytically and using carefully constructed examples. These results illustrate that although the natural homotopy provides a practically valuable heuristic for training, it suffers from a number of theoretical and practical difficulties. The linear system is a bifurcation point of the homotopy equations, and solution paths are therefore generically discontinuous. Bifurcations and infinite solutions further occur for data sets that are not of measure zero. These results weaken the guarantees on global convergence and exhaustive behavior normally associated with homotopy methods. However, the analyses presented provide a clear understanding of the relationship between linear and nonlinear perceptron networks, and thus a firm foundation for development of more powerful training methods. The geometric perspectives and generic topological results describing the nature of the solutions are further generally applicable to network analysis and algorithm evaluation.
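The linear-to-sigmoidal homotopy described in the abstract can be illustrated with a small sketch. The abstract does not give the paper's exact parameterization, so the convex blend `(1 - lam) * a + lam * tanh(a)` used here is an assumption, as are all function names, the network size, and the step counts. Plain gradient descent stands in for tracking the homotopy equations, which the paper analyzes but this sketch does not implement. In line with the abstract's caveat that solution paths are generically discontinuous, this staged warm start should be read as a heuristic with no convergence guarantee.

```python
import math
import random


def hidden_fn(a, lam):
    # Assumed blend: lam = 0 gives a linear unit, lam = 1 gives tanh.
    return (1.0 - lam) * a + lam * math.tanh(a)


def hidden_fn_grad(a, lam):
    # Derivative of the blend with respect to the pre-activation a.
    return (1.0 - lam) + lam * (1.0 - math.tanh(a) ** 2)


def forward(params, x, lam):
    # Single hidden layer, scalar input and scalar output.
    w, b, v, c = params
    return c + sum(vj * hidden_fn(wj * x + bj, lam) for wj, bj, vj in zip(w, b, v))


def mse(params, data, lam):
    return sum((forward(params, x, lam) - y) ** 2 for x, y in data) / len(data)


def gradient_step(params, data, lam, lr):
    # One full-batch gradient descent step on the MSE at a fixed lam.
    w, b, v, c = params
    h, n = len(w), len(data)
    gw, gb, gv, gc = [0.0] * h, [0.0] * h, [0.0] * h, 0.0
    for x, y in data:
        a = [w[j] * x + b[j] for j in range(h)]
        z = [hidden_fn(a[j], lam) for j in range(h)]
        d = 2.0 * (c + sum(v[j] * z[j] for j in range(h)) - y) / n
        gc += d
        for j in range(h):
            back = d * v[j] * hidden_fn_grad(a[j], lam)
            gv[j] += d * z[j]
            gw[j] += back * x
            gb[j] += back
    return (
        [w[j] - lr * gw[j] for j in range(h)],
        [b[j] - lr * gb[j] for j in range(h)],
        [v[j] - lr * gv[j] for j in range(h)],
        c - lr * gc,
    )


def train_by_continuation(data, n_hidden=2, n_stages=11,
                          steps_per_stage=300, lr=0.05, seed=0):
    # Start from the linear network (lam = 0) and move lam toward 1,
    # warm-starting each stage from the previous stage's weights.
    rng = random.Random(seed)
    params = (
        [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        0.0,
    )
    history = []
    for k in range(n_stages):
        lam = k / (n_stages - 1)
        for _ in range(steps_per_stage):
            params = gradient_step(params, data, lam, lr)
        history.append((lam, mse(params, data, lam)))
    return params, history


if __name__ == "__main__":
    # Hypothetical toy data set: samples of tanh(2x) on [-1, 1].
    toy = [(i / 10.0, math.tanh(2.0 * i / 10.0)) for i in range(-10, 11)]
    _, hist = train_by_continuation(toy)
    for lam, err in hist:
        print(f"lam={lam:.1f}  mse={err:.6f}")
```

Printing the per-stage error makes it easy to see how the fit changes as `lam` moves from the linear network toward the fully nonlinear one. An abrupt jump between consecutive stages would be one visible symptom of the discontinuous solution paths the abstract describes.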