Abstract
It is demonstrated both theoretically and experimentally that, under appropriate assumptions, a neural network pattern classifier trained with a supervised learning algorithm generates the empirical Bayes rule, i.e., the rule that is optimal against the empirical distribution of the training sample. It is also shown that, for a sufficiently large sample size, asymptotic equivalence of the network-generated rule to the theoretical Bayes rule, optimal against the true distribution governing the occurrence of the data, follows immediately from the law of large numbers. It is proposed that a Bayes statistical decision approach leads naturally to a probabilistic definition of the valid generalization that a neural network can be expected to produce from a finite training sample.
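The abstract's central claims can be illustrated with a minimal sketch (all distributions and data below are hypothetical, chosen only for illustration): on a discrete feature space, the minimizer of the empirical squared error is the empirical class posterior, thresholding it yields the empirical Bayes rule, and by the law of large numbers that rule converges to the theoretical Bayes rule as the sample grows.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical discrete feature space {0, 1, 2} with two classes.
# Assumed true posterior P(y=1 | x) for each x:
p1 = np.array([0.2, 0.6, 0.9])

# Draw a finite training sample from the true distribution.
n = 5000
x = rng.integers(0, 3, size=n)
y = (rng.random(n) < p1[x]).astype(int)

# Minimizing the empirical squared error sum_i (f(x_i) - y_i)^2 over
# all functions f on this discrete space gives, at each x, the mean of
# y there -- the empirical posterior P(y=1 | x). This is the target a
# squared-error-trained network approximates.
f = np.array([y[x == v].mean() for v in range(3)])

# Thresholding at 1/2 gives the empirical Bayes rule for 0-1 loss.
empirical_bayes_rule = (f > 0.5).astype(int)

# By the law of large numbers, f -> p1 as n grows, so the empirical
# rule agrees asymptotically with the theoretical Bayes rule.
theoretical_bayes_rule = (p1 > 0.5).astype(int)
print(empirical_bayes_rule, theoretical_bayes_rule)
```

Here a closed-form least-squares fit on a discrete space stands in for the trained network, so the equivalence can be checked exactly rather than approximately.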
