Learning Simple Concepts under Simple Distributions
- 1 October 1991
- journal article
- Published by Society for Industrial & Applied Mathematics (SIAM) in SIAM Journal on Computing
- Vol. 20 (5), 911-935
- https://doi.org/10.1137/0220056
Abstract
This is a preliminary draft version. The journal version (SIAM J. Computing, 20:5 (1991), 911-935) is the correct final version. However, the treatment of the polynomial-time computable universal distribution there is too sloppy. For a better treatment see "M. Li and P.M.B. Vitanyi, An Introduction to Kolmogorov Complexity and its Applications, Springer-Verlag, New York, Second Edition, 1997," Section 7.6, pp. 506-509.

We aim to develop a learning theory in which 'simple' concepts are easily learnable. In Valiant's learning model, many concepts turn out to be too hard (e.g., NP-hard) to learn, and relatively few concept classes have been shown to be learnable in polynomial time. In daily life, it seems that the things we care to learn are usually learnable. To model the intuitive notion of learning more closely, we do not require that the learning algorithm learn (polynomially) under all distributions, but only under all simple distributions. A distribution is simple if it is dominated by an enumerable distribution. All distributions with computable parameters that are used in statistics are simple.

Simple distributions are complete in the sense that a concept class is learnable under all simple distributions iff it is learnable under a fixed 'universal' simple distribution. This holds both for polynomial learning in the discrete case (under a modified model) and for non-time-restricted learning in the continuous case (under the usual model). We use this completeness result to obtain new learning algorithms and several quite general new learnable classes. These include a discrete class that is known not to be polynomially learnable in Valiant's model unless P = NP, and a continuous class that is not learnable in Valiant's model. Our results imply that, for each concept class from a wide range of concept classes and each underlying distribution from a wide range of distributions, the learning algorithm can draw examples by a single fixed algorithmic process using a random number generator.

The 'universal' simple distribution is not computable. To make our theory feasible, we develop a polynomial-time version of it. All results derived for discrete sample spaces hold mutatis mutandis for the polynomial-time versions, including versions of completeness, the new learning algorithms, and the new learnable classes.
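The abstract's key notions of domination and the universal distribution can be stated precisely using standard definitions from the Kolmogorov-complexity literature. The sketch below uses the textbook notation of Li and Vitanyi (the semimeasure m and prefix complexity K); these symbols are assumptions on my part and not necessarily the paper's own:

```latex
% Sketch in standard Kolmogorov-complexity notation (an assumption;
% the paper's own symbols may differ).

% Q multiplicatively dominates P if
\[
  \exists\, c > 0 \;\; \forall x : \quad P(x) \le c\, Q(x).
\]

% A distribution P is "simple" if some enumerable (lower semicomputable)
% semimeasure dominates it.

% The universal enumerable semimeasure \mathbf{m} dominates every
% enumerable semimeasure and, by the coding theorem, satisfies
\[
  \mathbf{m}(x) = 2^{-K(x) + O(1)},
\]
% where K(x) is the prefix Kolmogorov complexity of x. The completeness
% result then reads: a concept class is learnable under all simple
% distributions iff it is learnable when examples are drawn according
% to \mathbf{m}.
```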
References
- Approximation algorithms for combinatorial problems. Elsevier, 2007.
- Kolmogorov Complexity and its Applications. Elsevier, 1990.
- Computational limitations on learning from examples. Journal of the ACM, 1988.
- Occam's Razor. Information Processing Letters, 1987.
- Learning Decision Lists. Machine Learning, 1987.
- Inference of Reversible Languages. Journal of the ACM, 1982.
- On the complexity of minimum inference of regular sets. Information and Control, 1978.
- Complexity of automaton identification from given data. Information and Control, 1978.
- On the ratio of optimal integral and fractional covers. Discrete Mathematics, 1975.
- On the Synthesis of Finite-State Machines from Samples of Their Behavior. IEEE Transactions on Computers, 1972.