Optimization of constrained frequent set queries with 2-variable constraints

1 June 1999

journal article
Published by Association for Computing Machinery (ACM) in ACM SIGMOD Record

Vol. 28 (2) , 157-168
https://doi.org/10.1145/304181.304196

Abstract

Currently, there is tremendous interest in providing ad-hoc mining capabilities in database management systems. As a first step towards this goal, in [15] we proposed an architecture for supporting constraint-based, human-centered, exploratory mining of various kinds of rules including associations, introduced the notion of constrained frequent set queries (CFQs), and developed effective pruning optimizations for CFQs with 1-variable (1-var) constraints. While 1-var constraints are useful for constraining the antecedent and consequent separately, many natural examples of CFQs illustrate the need for constraining the antecedent and consequent jointly, for which 2-variable (2-var) constraints are indispensable. Developing pruning optimizations for CFQs with 2-var constraints is the subject of this paper. But this is a difficult problem because: (i) in 2-var constraints, both variables keep changing and, unlike 1-var constraints, there is no fixed target for pruning; (ii) as we show, “conventional” monotonicity-based optimization techniques do not apply effectively to 2-var constraints. The contributions are as follows. (1) We introduce a notion of quasi-succinctness , which allows a quasi-succinct 2-var constraint to be reduced to two succinct 1-var constraints for pruning. (2) We characterize the class of 2-var constraints that are quasi-succinct. (3) We develop heuristic techniques for non-quasi-succinct constraints. Experimental results show the effectiveness of all our techniques. (4) We propose a query optimizer for CFQs and show that for a large class of constraints, the computation strategy generated by the optimizer is ccc-optimal , i.e., minimizing the effort incurred w.r.t. constraint checking and support counting.

Keywords

This publication has 13 references indexed in Scilit:

Efficiently mining long patterns from databases
Published by Association for Computing Machinery (ACM) ,1998
Integrating association rule mining with relational database systems
Published by Association for Computing Machinery (ACM) ,1998
Exploratory mining and pruning optimizations of constrained associations rules
Published by Association for Computing Machinery (ACM) ,1998
Beyond market baskets
Published by Association for Computing Machinery (ACM) ,1997
Association rules over interval data
Published by Association for Computing Machinery (ACM) ,1997
A database perspective on knowledge discovery
Communications of the ACM, 1996
Data mining using two-dimensional optimized association rules
Published by Association for Computing Machinery (ACM) ,1996
An effective hash-based algorithm for mining association rules
Published by Association for Computing Machinery (ACM) ,1995
Finding interesting rules from large sets of discovered association rules
Published by Association for Computing Machinery (ACM) ,1994
Mining association rules between sets of items in large databases
Published by Association for Computing Machinery (ACM) ,1993