Efficient algorithms for mining closed itemsets and their lattice structure
Top Cited Papers
- 7 March 2005
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Knowledge and Data Engineering
- Vol. 17 (4) , 462-478
- https://doi.org/10.1109/tkde.2005.60
Abstract
The set of frequent closed itemsets uniquely determines the exact frequency of all itemsets, yet it can be orders of magnitude smaller than the set of all frequent itemsets. In this paper, we present CHARM, an efficient algorithm for mining all frequent closed itemsets. It enumerates closed sets using a dual itemset-tidset search tree, using an efficient hybrid search that skips many levels. It also uses a technique called diffsets to reduce the memory footprint of intermediate computations. Finally, it uses a fast hash-based approach to remove any "nonclosed" sets found during computation. We also present CHARM-L, an algorithm that outputs the closed itemset lattice, which is very useful for rule generation and visualization. An extensive experimental evaluation on a number of real and synthetic databases shows that CHARM is a state-of-the-art algorithm that outperforms previous methods. Further, CHARM-L explicitly generates the frequent closed itemset lattice.Keywords
This publication has 16 references indexed in Scilit:
- CLOSET+Published by Association for Computing Machinery (ACM) ,2003
- Fast vertical mining using diffsetsPublished by Association for Computing Machinery (ACM) ,2003
- Efficiently mining maximal frequent itemsetsPublished by Institute of Electrical and Electronics Engineers (IEEE) ,2002
- MAFIA: a maximal frequent itemset algorithm for transactional databasesPublished by Institute of Electrical and Electronics Engineers (IEEE) ,2002
- Mining frequent patterns with counting inferenceACM SIGKDD Explorations Newsletter, 2000
- Depth first generation of long patternsPublished by Association for Computing Machinery (ACM) ,2000
- Mining frequent patterns without candidate generationPublished by Association for Computing Machinery (ACM) ,2000
- Scalable algorithms for association miningIEEE Transactions on Knowledge and Data Engineering, 2000
- A fast algorithm for building latticesInformation Processing Letters, 1999
- An effective hash-based algorithm for mining association rulesACM SIGMOD Record, 1995