Feature selection via set cover
- 22 November 2002
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- p. 165-171
- https://doi.org/10.1109/kdex.1997.629862
Abstract
In pattern classification, features are used to define classes. Feature selection is a preprocessing step that searches for an "optimal" subset of features. Class separability is normally used as the basic feature selection criterion. Instead of maximizing class separability, as in the literature, this work adopts a criterion that aims to maintain the discriminating power of the data in describing its classes. In other words, the problem is formalized as finding the smallest set of features that is "consistent" in describing the classes. We describe a multivariate measure of feature consistency. The new feature selection algorithm is based on Johnson's (1974) algorithm for set covering. Johnson's analysis implies that this algorithm runs in polynomial time and outputs a consistent feature set whose size is within a log factor of the best possible. Our experiments show that its performance in practice is much better than this, and that it outperforms earlier methods using a similar amount of time.
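The reduction the abstract describes can be sketched in code. A minimal, hypothetical illustration (not the authors' implementation): the set-cover universe consists of all pairs of examples from different classes, a feature "covers" a pair if it takes different values on the two examples, and Johnson's greedy heuristic repeatedly selects the feature covering the most uncovered pairs until the selected subset is consistent.

```python
from itertools import combinations

def select_features(X, y):
    """Greedy set-cover feature selection (illustrative sketch).

    X: list of feature vectors (lists of hashable values).
    y: list of class labels.
    Returns indices of a feature subset that distinguishes every
    distinguishable pair of examples from different classes.
    """
    n_features = len(X[0])
    # Universe: pairs of examples with different labels that differ on
    # at least one feature (identical vectors can never be separated).
    uncovered = {
        (i, j)
        for i, j in combinations(range(len(X)), 2)
        if y[i] != y[j] and X[i] != X[j]
    }
    selected = []
    while uncovered:
        # Johnson's rule: pick the feature covering the most uncovered pairs.
        best = max(
            range(n_features),
            key=lambda f: sum(1 for i, j in uncovered if X[i][f] != X[j][f]),
        )
        selected.append(best)
        uncovered = {(i, j) for i, j in uncovered if X[i][best] == X[j][best]}
    return selected
```

On a toy dataset where the first feature alone separates the classes, the sketch returns just that feature, reflecting the "smallest consistent subset" goal rather than a separability-maximizing one.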
This publication has 7 references indexed in Scilit:
- Approximation algorithms for combinatorial problems. Published by Elsevier, 2007
- Feature selection for classification. Intelligent Data Analysis, 1997
- Learning Boolean concepts in the presence of many irrelevant features. Artificial Intelligence, 1994
- Efficiently Inducing Determinations: A Complete and Systematic Search Algorithm that Uses Optimal Pruning. Published by Elsevier, 1993
- Constructive Induction Using a Non-Greedy Strategy for Feature Selection. Published by Elsevier, 1992
- On Automatic Feature Selection. International Journal of Pattern Recognition and Artificial Intelligence, 1988
- A Branch and Bound Algorithm for Feature Subset Selection. IEEE Transactions on Computers, 1977