H-mine: hyper-structure mining of frequent patterns in large databases
- 14 November 2002
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- p. 441-448
- https://doi.org/10.1109/icdm.2001.989550
Abstract
Methods for efficient mining of frequent patterns have been studied extensively by many researchers. However, the previously proposed methods still encounter some performance bottlenecks when mining databases with different data characteristics, such as dense vs. sparse, long vs. short patterns, memory-based vs. disk-based, etc. In this study, we propose a simple and novel hyper-linked data structure, H-struct, and a new mining algorithm, H-mine, which takes advantage of this data structure and dynamically adjusts links in the mining process. A distinct feature of this method is that it has very limited and precisely predictable space overhead and runs very fast in a memory-based setting. Moreover, it can be scaled up to very large databases by database partitioning, and when the data set becomes dense, (conditional) FP-trees can be constructed dynamically as part of the mining process. Our study shows that H-mine has high performance on various kinds of data, outperforms the previously developed algorithms in different settings, and is highly scalable in mining large databases. This study also proposes a new data mining methodology, space-preserving mining, which may have a strong impact on the future development of efficient and scalable data mining methods.
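The abstract's central idea, mining through links into a single stored copy of the data rather than through copied sub-databases, can be illustrated with a minimal Python sketch. This is not the paper's H-struct or H-mine algorithm: it is a simplified pattern-growth miner that assumes items are mutually comparable (so they can be sorted), and it omits database partitioning and the switch to FP-trees. The names `mine_frequent_patterns`, `stored`, `grow`, and `queues` are hypothetical, chosen only for this illustration.

```python
from collections import defaultdict


def mine_frequent_patterns(transactions, min_support):
    """Return {frozenset(pattern): support} for all frequent patterns.

    Simplified illustration only: each transaction is stored once, and
    projections are represented by (transaction id, position) pointers
    instead of copied sub-databases.
    """
    # Pass 1: count how many transactions contain each item.
    counts = defaultdict(int)
    for transaction in transactions:
        for item in set(transaction):
            counts[item] += 1
    frequent = {item for item, c in counts.items() if c >= min_support}

    # Store every transaction exactly once, keeping only frequent items
    # in a fixed (sorted) order. Assumes items are comparable.
    stored = [sorted(set(t) & frequent) for t in transactions]

    results = {}

    def grow(prefix, pointers):
        # Each pointer (tid, pos) stands for the suffix stored[tid][pos:],
        # i.e. the part of a transaction that can still extend `prefix`.
        queues = defaultdict(list)
        for tid, pos in pointers:
            row = stored[tid]
            for j in range(pos, len(row)):
                # Link this transaction into the queue of item row[j].
                queues[row[j]].append((tid, j + 1))
        for item in sorted(queues):
            queue = queues[item]
            if len(queue) >= min_support:
                pattern = prefix + (item,)
                results[frozenset(pattern)] = len(queue)
                # Recurse using pointers only; no transaction is copied.
                grow(pattern, queue)

    grow((), [(tid, 0) for tid in range(len(stored))])
    return results


if __name__ == "__main__":
    # Toy data invented for this sketch (not taken from the paper).
    toy = [["a", "c", "d"], ["a", "d"], ["c", "d", "e"], ["a", "c", "d", "e"]]
    for pattern, support in sorted(
        mine_frequent_patterns(toy, 2).items(),
        key=lambda kv: (len(kv[0]), sorted(kv[0])),
    ):
        print(sorted(pattern), support)
```

In this sketch the extra memory beyond the stored transactions is the set of pointer queues, which loosely mirrors the abstract's point about limited space overhead, although the paper's structure adjusts links in place rather than building new lists at each level.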