Adaptive and resource-aware mining of frequent sets
- 26 June 2003
- proceedings article
- Published by Institute of Electrical and Electronics Engineers (IEEE)
Abstract
The performance of an algorithm that mines frequent sets from transactional databases may depend heavily on the specific features of the data being analyzed. Moreover, some architectural characteristics of the computational platform used (e.g. the available main memory) can dramatically change its runtime behavior. In this paper we present DCI (Direct Count & Intersect), an efficient algorithm for discovering frequent sets from large databases. Due to the multiple heuristic strategies adopted, DCI can adapt its behavior not only to the features of the specific computing platform, but also to the features of the dataset being mined, so that it is very effective in mining both short and long patterns from sparse and dense datasets. Finally, we also discuss the parallelization strategies adopted in the design of ParDCI, a distributed and multi-threaded implementation of DCI.
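For readers unfamiliar with the problem, the following is a minimal sketch of frequent set mining in which support is computed by intersecting sets of transaction ids. It is a generic illustration, not the DCI algorithm described in the paper: the function name `frequent_itemsets`, its parameters, and the toy data are assumptions made for this example, and none of DCI's adaptive heuristics or memory-aware strategies are modeled.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise frequent set mining with tid-set intersections.

    Hypothetical helper for illustration only; this is NOT DCI itself.
    Returns a dict mapping each frequent itemset (a sorted tuple) to
    its support, i.e. the number of transactions that contain it.
    """
    # Vertical layout: item -> set of ids of the transactions containing it.
    tidsets = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets.setdefault(item, set()).add(tid)

    # Level 1: frequent single items, stored as 1-tuples.
    level = {
        (item,): tids
        for item, tids in tidsets.items()
        if len(tids) >= min_support
    }

    result = {}
    while level:
        result.update({itemset: len(tids) for itemset, tids in level.items()})
        next_level = {}
        # Join pairs of k-itemsets that share their first k-1 items.
        for a, b in combinations(sorted(level), 2):
            if a[:-1] != b[:-1]:
                continue
            candidate = a + (b[-1],)
            # Support of the candidate = size of the tid-set intersection.
            tids = level[a] & level[b]
            if len(tids) >= min_support:
                next_level[candidate] = tids
        level = next_level
    return result


if __name__ == "__main__":
    # Toy data, assumed for this example only.
    toy = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    for itemset, support in sorted(frequent_itemsets(toy, 3).items()):
        print(itemset, support)
```

Keeping one tid-set per itemset makes each support computation a single set intersection, at the cost of holding the vertical layout in memory. That memory cost is the kind of platform-dependent trade-off the abstract alludes to.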