DEMON: mining and monitoring evolving data
- 1 January 2001
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Knowledge and Data Engineering
- Vol. 13 (1), 50-63
- https://doi.org/10.1109/69.908980
Abstract
Data mining algorithms have been the focus of much research. In practice, the input data to a data mining process resides in a large data warehouse whose data is kept up to date through periodic or occasional addition and deletion of blocks of data. Most data mining algorithms have either assumed that the input data is static, or have been designed for arbitrary insertions and deletions of data records. We consider a dynamic environment that evolves through systematic addition or deletion of blocks of data. We introduce a new dimension, called the data span dimension, which allows user-defined selections of a temporal subset of the database. Taking this new degree of freedom into account, we describe efficient model maintenance algorithms for frequent item sets and clusters. We then describe a generic algorithm that takes any traditional incremental model maintenance algorithm and transforms it into an algorithm that allows restrictions on the data span dimension. We also develop an algorithm for automatically discovering a specific class of interesting block selection sequences. In a detailed experimental study, we examine the validity and performance of our ideas on synthetic and real datasets.
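The idea of restricting a model to a temporal subset of blocks can be illustrated with a minimal Python sketch. This is not the algorithm from the article: the class `WindowedItemsetCounts`, its `window` and `max_size` parameters, and the brute-force enumeration of small item sets are all assumptions made for illustration. The sketch keeps one count summary per block, so that a "most recent w blocks" selection can be maintained by adding the newest block's counts and subtracting the oldest block's counts instead of rescanning the data.

```python
from collections import Counter, deque
from itertools import combinations


class WindowedItemsetCounts:
    """Support counts for small item sets over the `window` most recent blocks.

    Hypothetical illustration only: item sets up to `max_size` items are
    enumerated exhaustively, which is practical only for tiny examples.
    """

    def __init__(self, window, max_size=2):
        self.window = window          # number of most recent blocks to keep
        self.max_size = max_size      # largest item set size that is counted
        self.blocks = deque()         # (per-block counts, block size) pairs
        self.total = Counter()        # counts aggregated over the window
        self.n_transactions = 0       # transactions currently in the window

    def _summarize(self, block):
        # Count every item set of size 1..max_size in one block.
        counts = Counter()
        for transaction in block:
            items = sorted(set(transaction))
            for k in range(1, self.max_size + 1):
                counts.update(combinations(items, k))
        return counts, len(block)

    def add_block(self, block):
        # Add the new block's summary to the aggregate.
        counts, size = self._summarize(block)
        self.blocks.append((counts, size))
        self.total.update(counts)
        self.n_transactions += size
        # Evict the oldest block once the window is exceeded.
        if len(self.blocks) > self.window:
            old_counts, old_size = self.blocks.popleft()
            self.total.subtract(old_counts)
            self.n_transactions -= old_size

    def frequent(self, min_support):
        # Return item sets whose relative support meets the threshold.
        if self.n_transactions == 0:
            return {}
        return {
            itemset: count / self.n_transactions
            for itemset, count in self.total.items()
            if count > 0 and count / self.n_transactions >= min_support
        }


model = WindowedItemsetCounts(window=2)
model.add_block([["a", "b"], ["a", "c"]])
model.add_block([["a", "b"], ["b"]])
first = model.frequent(0.5)
model.add_block([["c"], ["c", "b"]])   # the first block leaves the window
second = model.frequent(0.5)
```

Storing per-block summaries is the design choice that makes eviction cheap here; a realistic system would replace the exhaustive enumeration with a proper frequent item set miner.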
This publication has 12 references indexed in Scilit:
- Cure: an efficient clustering algorithm for large databases. Information Systems, 2001
- BOAT—optimistic decision tree construction. Published by Association for Computing Machinery (ACM), 1999
- A framework for measuring changes in data characteristics. Published by Association for Computing Machinery (ACM), 1999
- An overview of data warehousing and OLAP technology. ACM SIGMOD Record, 1997
- A General Incremental Technique for Maintaining Discovered Association Rules. Published by World Scientific Pub Co Pte Ltd, 1997
- Parallel Algorithms for Discovery of Association Rules. Data Mining and Knowledge Discovery, 1997
- BIRCH. ACM SIGMOD Record, 1996
- Maintenance of discovered association rules in large databases: an incremental updating technique. Published by Institute of Electrical and Electronics Engineers (IEEE), 1996
- ID5: An Incremental ID3. Published by Elsevier, 1988
- Recent trends in hierarchic document clustering: A critical review. Information Processing & Management, 1988