A critical review of multi-objective optimization in data mining
- 1 December 2004
- journal article
- review article
- Published by Association for Computing Machinery (ACM) in ACM SIGKDD Explorations Newsletter
- Vol. 6 (2), 77-86
- https://doi.org/10.1145/1046456.1046467
Abstract
This paper addresses the problem of how to evaluate the quality of a model built from data in a multi-objective optimization scenario, where two or more quality criteria must be optimized simultaneously. A typical example is maximizing both the accuracy and the simplicity of a classification model, or of a candidate attribute subset in attribute selection. The paper reviews three very different approaches to this problem: (a) transforming the original multi-objective problem into a single-objective problem by using a weighted formula; (b) the lexicographic approach, where the objectives are ranked in order of priority; and (c) the Pareto approach, which consists of finding as many non-dominated solutions as possible and returning that set of non-dominated solutions to the user. It also presents a critical review of the case for and against each of these approaches. The general conclusions are that the weighted-formula approach, which is by far the most used in the data mining literature, is to a large extent an ad hoc approach to multi-objective optimization, whereas the lexicographic and Pareto approaches are more principled and therefore deserve more attention from the data mining community.
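The three approaches named in the abstract can be contrasted on a toy example. The sketch below is illustrative only and is not taken from the paper: the `Candidate` type, the four candidate models, their scores, the weights, and the `tolerance` used to treat accuracies as tied are all hypothetical assumptions, and both objectives are assumed to be scaled so that higher is better.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """Hypothetical model summary; both scores are 'higher is better'."""
    name: str
    accuracy: float
    simplicity: float


# Made-up candidates, for illustration only.
candidates = [
    Candidate("A", accuracy=0.90, simplicity=0.20),
    Candidate("B", accuracy=0.895, simplicity=0.60),
    Candidate("C", accuracy=0.80, simplicity=0.90),
    Candidate("D", accuracy=0.78, simplicity=0.50),
]


def weighted_best(cands: List[Candidate], w_acc: float = 0.7,
                  w_simp: float = 0.3) -> Candidate:
    """(a) Weighted formula: collapse both objectives into one score."""
    return max(cands, key=lambda c: w_acc * c.accuracy + w_simp * c.simplicity)


def lexicographic_best(cands: List[Candidate],
                       tolerance: float = 0.01) -> Candidate:
    """(b) Lexicographic: accuracy has priority; simplicity breaks ties.

    Accuracies within `tolerance` of the best one count as tied
    (the tolerance value is an assumption of this sketch).
    """
    best_acc = max(c.accuracy for c in cands)
    tied = [c for c in cands if best_acc - c.accuracy <= tolerance]
    return max(tied, key=lambda c: c.simplicity)


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is at least as good as b on both objectives and better on one."""
    no_worse = a.accuracy >= b.accuracy and a.simplicity >= b.simplicity
    better = a.accuracy > b.accuracy or a.simplicity > b.simplicity
    return no_worse and better


def pareto_front(cands: List[Candidate]) -> List[Candidate]:
    """(c) Pareto: keep every candidate that no other candidate dominates."""
    return [c for c in cands if not any(dominates(o, c) for o in cands)]


if __name__ == "__main__":
    print("weighted (0.7, 0.3):", weighted_best(candidates).name)
    print("weighted (0.9, 0.1):", weighted_best(candidates, 0.9, 0.1).name)
    print("lexicographic:", lexicographic_best(candidates).name)
    print("Pareto front:", [c.name for c in pareto_front(candidates)])
```

With these made-up numbers, the weighted formula selects a different candidate when the weights change from (0.7, 0.3) to (0.9, 0.1), which is one way to see why the choice of weights matters so much. The lexicographic rule needs only a priority order plus a tie threshold, and the Pareto approach returns a set of trade-offs rather than a single winner.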