Learning quickly when irrelevant attributes abound: A new linear-threshold algorithm

1 October 1987

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

Vol. 1 (02725428) , 68-77
https://doi.org/10.1109/sfcs.1987.37

Abstract

Valiant and others have studied the problem of learning various classes of Boolean functions from examples. Here we discuss on-line learning of these functions. In on-line learning, the learner responds to each example according to a current hypothesis. Then the learner updates the hypothesis, if necessary, based on the correct classification of the example. One natural measure of the quality of learning in the on-line setting is the number of mistakes the learner makes. For suitable classes of functions, on-line learning algorithms are available that make a bounded number of mistakes, with the bound independent of the number of examples seen by the learner. We present one such algorithm, which learns disjunctive Boolean functions, and variants of the algorithm for learning other classes of Boolean functions. The algorithm can be expressed as a linear-threshold algorithm. A primary advantage of this algorithm is that the number of mistakes that it makes is relatively little affected by the presence of large numbers of irrelevant attributes in the examples; we show that the number of mistakes grows only logarithmically with the number of irrelevant attributes. At the same time, the algorithm is computationaUy time and space efficient.

Keywords

This publication has 9 references indexed in Scilit:

Occam's Razor
Information Processing Letters, 1987
Recent Results on Boolean Concept Learning
Published by Elsevier ,1987
Linear function neurons: Structure and training
Biological Cybernetics, 1986
Parallel Distributed Processing
Published by MIT Press ,1986
The Logic of Learning: A Basis for Pattern Recognition and for Improvement of Performance
Published by Elsevier ,1985
A theory of the learnable
Communications of the ACM, 1984
Inductive Inference: Theory and Methods
ACM Computing Surveys, 1983
Generalization as search
Artificial Intelligence, 1982
On the Uniform Convergence of Relative Frequencies of Events to Their Probabilities
Theory of Probability and Its Applications, 1971