A statistically based system for prioritizing information exploration under uncertainty
- 1 July 1997
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans
- Vol. 27 (4), 449-466
- https://doi.org/10.1109/3468.594912
Abstract
This paper examines the problem of prioritizing actions under uncertainty. Our motivating applications come from the domain of data mining. Data mining problems present the user with a huge collection of individual items (e.g., abstracts, medical histories, and computer users' command histories) and require that these items be prioritized according to which should be pursued thoroughly. More precisely, each data item is assumed to be generated by one of two processes: A large majority of the data comes from a common, mundane process and a very small fraction comes from a rare, phenomenon process. The problem is to rank the information so as to optimally direct the user in his or her pursuit of the data items that were generated by the phenomenon process. Our previous work has developed the theoretical foundations of the information prioritization problem. The current paper summarizes these foundations, derives new theoretical results, and details initial experimental results of a prioritization system based on the theory. We focus here on feature selection techniques and the method of model surrogates, each tailored to the classes of prioritization applications of greatest current interest. Our results demonstrate the effectiveness of the techniques and motivate further research to improve the existing system.
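The two-process setup described in the abstract can be illustrated with a minimal sketch. It is not the paper's method: it assumes each item is reduced to a single numeric score, that both processes are Gaussian with known parameters, and that items are ranked by the posterior probability of coming from the phenomenon process. The names `phenomenon_posterior` and `prioritize`, the prior of 0.01, and the Gaussian parameters are all hypothetical choices made for illustration.

```python
import math

def log_gaussian(x, mean, std):
    """Log density of a univariate normal distribution."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std * math.sqrt(2.0 * math.pi))

def phenomenon_posterior(x, prior=0.01, mundane=(0.0, 1.0), phenomenon=(3.0, 1.0)):
    """Posterior probability that score x came from the rare process.

    ASSUMPTION: both processes are Gaussian with known (mean, std);
    `prior` is the assumed small fraction of phenomenon items.
    """
    log_m = math.log(1.0 - prior) + log_gaussian(x, *mundane)
    log_p = math.log(prior) + log_gaussian(x, *phenomenon)
    # Subtract the larger term before exponentiating for numerical stability.
    top = max(log_m, log_p)
    weight_m = math.exp(log_m - top)
    weight_p = math.exp(log_p - top)
    return weight_p / (weight_m + weight_p)

def prioritize(items, **model):
    """Sort (name, score) pairs so likely phenomenon items come first."""
    return sorted(
        items,
        key=lambda item: phenomenon_posterior(item[1], **model),
        reverse=True,
    )

# Hypothetical data: four items, each summarized by one score.
items = [("a", 0.2), ("b", 3.1), ("c", -1.0), ("d", 1.8)]
ranked = prioritize(items)
for name, score in ranked:
    print(name, score, round(phenomenon_posterior(score), 4))
```

Because the rare process is given a small prior, even a high-scoring item may have a modest posterior; the ranking, not the absolute probability, is what directs the user's attention in this sketch.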
This publication has 5 references indexed in Scilit:
- Statistical foundations of audit trail analysis for the detection of computer misuse. IEEE Transactions on Software Engineering, 1993
- Optimization by Simulated Annealing: An Experimental Evaluation; Part I, Graph Partitioning. Operations Research, 1989
- Learning quickly when irrelevant attributes abound: A new linear-threshold algorithm. Machine Learning, 1988
- A method for attribute selection in inductive learning systems. Published by Institute of Electrical and Electronics Engineers (IEEE), 1988
- The Population Frequencies of Species and the Estimation of Population Parameters. Biometrika, 1953