Learning Simple Concepts under Simple Distributions
- 1 October 1991
- journal article
- Published by Society for Industrial & Applied Mathematics (SIAM) in SIAM Journal on Computing
- Vol. 20 (5), 911-935
- https://doi.org/10.1137/0220056
Abstract
This is a preliminary draft version. The journal version (SIAM J. Computing, 20:5 (1991), 911-935) is the correct final version. However, the treatment of the polynomial-time computable universal distribution there is too sloppy. For a better treatment see "M. Li and P.M.B. Vitanyi, An Introduction to Kolmogorov Complexity and its Applications, Springer-Verlag, New York, Second Edition, 1997," Section 7.6, pp. 506-509.

We aim to develop a learning theory in which 'simple' concepts are easily learnable. In Valiant's learning model, many concepts turn out to be too hard (e.g., NP-hard) to learn, and relatively few concept classes have been shown to be learnable in polynomial time. In daily life, it seems that the things we care to learn are usually learnable. To model the intuitive notion of learning more closely, we do not require that the learning algorithm learn (polynomially) under all distributions, but only under all simple distributions. A distribution is simple if it is dominated by an enumerable distribution. All distributions with computable parameters that are used in statistics are simple.

Simple distributions are complete in the sense that a concept class is learnable under all simple distributions iff it is learnable under a fixed 'universal' simple distribution. This holds both for polynomial learning in the discrete case (under a modified model) and for non-time-restricted learning in the continuous case (under the usual model). We use this completeness result to obtain new learning algorithms and several quite general new learnable classes. These include a discrete class that is known not to be polynomially learnable in Valiant's model unless P = NP, and a continuous class that is not learnable in Valiant's model. Our results imply that, for each concept class from a wide range of concept classes and each underlying distribution from a wide range of distributions, the learning algorithm can draw examples by a single fixed algorithmic process using a random number generator.

The 'universal' simple distribution is not computable. To make our theory feasible, we develop a polynomial-time version of it. All results derived for discrete sample spaces hold mutatis mutandis for the polynomial-time versions, including versions of completeness, the new learning algorithms, and the new learnable classes.
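The abstract's key notions of domination and the universal distribution can be stated precisely using standard definitions from the Kolmogorov-complexity literature. The sketch below uses the textbook notation of Li and Vitanyi (the semimeasure m and prefix complexity K); these symbols are assumptions on my part and not necessarily the paper's own:

```latex
% Sketch in standard Kolmogorov-complexity notation (an assumption;
% the paper's own symbols may differ).

% Q multiplicatively dominates P if
\[
  \exists\, c > 0 \;\; \forall x : \quad P(x) \le c\, Q(x).
\]

% A distribution P is "simple" if some enumerable (lower semicomputable)
% semimeasure dominates it.

% The universal enumerable semimeasure \mathbf{m} dominates every
% enumerable semimeasure and, by the coding theorem, satisfies
\[
  \mathbf{m}(x) = 2^{-K(x) + O(1)},
\]
% where K(x) is the prefix Kolmogorov complexity of x. The completeness
% result then reads: a concept class is learnable under all simple
% distributions iff it is learnable when examples are drawn according
% to \mathbf{m}.
```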
References
- Approximation algorithms for combinatorial problems. Elsevier, 2007.
- Kolmogorov Complexity and its Applications. Elsevier, 1990.
- Computational limitations on learning from examples. Journal of the ACM, 1988.
- Occam's Razor. Information Processing Letters, 1987.
- Learning Decision Lists. Machine Learning, 1987.
- Inference of Reversible Languages. Journal of the ACM, 1982.
- On the complexity of minimum inference of regular sets. Information and Control, 1978.
- Complexity of automaton identification from given data. Information and Control, 1978.
- On the ratio of optimal integral and fractional covers. Discrete Mathematics, 1975.
- On the Synthesis of Finite-State Machines from Samples of Their Behavior. IEEE Transactions on Computers, 1972.