Finding interesting associations without support pruning
- 1 January 2001
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Knowledge and Data Engineering
- Vol. 13 (1) , 64-78
- https://doi.org/10.1109/69.908981
Abstract
Association-rule mining has heretofore relied on the condition of high support to do its work efficiently. In particular, the well-known Apriori algorithm is only effective when the only rules of interest are relationships that occur very frequently. However, there are a number of applications, such as data mining, identification of similar Web documents, clustering, and collaborative filtering, where the rules of interest have comparatively few instances in the data. In these cases, we must look for highly correlated items, or possibly even causal relationships between infrequent items. We develop a family of algorithms for solving this problem, employing a combination of random sampling and hashing techniques. We provide analysis of the algorithms developed and conduct experiments on real and synthetic data to obtain a comparative performance analysis.
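The abstract does not spell out the algorithms, so the following is only an illustrative sketch of how random hashing can surface highly correlated items without a support threshold. It assumes min-hashing, a standard technique for estimating the Jaccard similarity of two sets, and treats each item as the set of row (basket) ids that contain it. The function names, the hash family, and the toy data are all assumptions made for illustration and are not taken from the paper.

```python
import random

PRIME = 2_147_483_647  # Mersenne prime 2^31 - 1, used as the hash modulus


def minhash_signatures(columns, num_hashes=200, seed=0):
    """Map each item to a min-hash signature of its (non-empty) row-id set.

    columns: dict of item -> set of integer row ids containing that item.
    Each hash is a random linear function h(r) = (a*r + b) mod PRIME,
    an assumed stand-in for a random permutation of the rows.
    """
    rng = random.Random(seed)
    params = [(rng.randrange(1, PRIME), rng.randrange(0, PRIME))
              for _ in range(num_hashes)]
    return {
        item: [min((a * r + b) % PRIME for r in rows) for a, b in params]
        for item, rows in columns.items()
    }


def estimated_similarity(sig_a, sig_b):
    """Fraction of positions where two signatures agree (estimates Jaccard)."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


def jaccard(a, b):
    """Exact Jaccard similarity, for comparison with the estimate."""
    return len(a & b) / len(a | b)


# Toy data (hypothetical): two rare items that co-occur, one frequent item.
columns = {
    "caviar": {3, 17, 42, 99, 250},
    "vodka": {3, 17, 42, 99, 251},
    "bread": set(range(0, 2000, 2)),
}
signatures = minhash_signatures(columns)
rare_pair_estimate = estimated_similarity(signatures["caviar"], signatures["vodka"])
rare_pair_exact = jaccard(columns["caviar"], columns["vodka"])
unrelated_estimate = estimated_similarity(signatures["caviar"], signatures["bread"])
```

In this sketch the rare pair has low support (five rows each) but high similarity, so a support-based filter would discard it while the signature comparison keeps it. Comparing fixed-length signatures also avoids rescanning the full row sets for every candidate pair.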
This publication has 9 references indexed in Scilit:
- On the resemblance and containment of documents. Published by Institute of Electrical and Electronics Engineers (IEEE), 2002
- Approximate nearest neighbors. Published by Association for Computing Machinery (ACM), 1998
- Size-Estimation Framework with Applications to Transitive Closure and Reachability. Journal of Computer and System Sciences, 1997
- Online aggregation. ACM SIGMOD Record, 1997
- Dynamic itemset counting and implication rules for market basket data. Published by Association for Computing Machinery (ACM), 1997
- Building a scalable and accurate copy detection mechanism. Published by Association for Computing Machinery (ACM), 1996
- Randomized Algorithms. Published by Cambridge University Press (CUP), 1995
- Mining association rules between sets of items in large databases. Published by Association for Computing Machinery (ACM), 1993
- Using collaborative filtering to weave an information tapestry. Communications of the ACM, 1992