Optimization of a language for data mining

9 March 2003

proceedings article
Published by Association for Computing Machinery (ACM)

p. 437-444
https://doi.org/10.1145/952532.952619

Abstract

Constraint-based mining has attracted the interest of the data mining research community in recent years because it increases the relevance of the result set and reduces both its volume and the workload. However, constraint-based mining will be fully feasible only when efficient optimizers for mining languages become available. This paper is a first step towards the construction of optimizers for a constraint-based mining language. It provides guidelines for comparing classes of statements by means of the relationships between their result sets. Furthermore, it identifies the presence of unique constraints and functional dependencies in the database schema as information useful for optimization. We show the practical implications of the discussed principles with a set of algorithms designed for a specific mining language. These algorithms also use a newly designed index, called the mining index, which reduces the portion of the database that must be read in response to some classes of queries. For these queries, the workload of the mining engine is greatly reduced or, in a significant subset of cases, avoided entirely.
