Optimization of constrained frequent set queries with 2-variable constraints
- 1 June 1999
- journal article
- Published by Association for Computing Machinery (ACM) in ACM SIGMOD Record
- Vol. 28 (2) , 157-168
- https://doi.org/10.1145/304181.304196
Abstract
Currently, there is tremendous interest in providing ad-hoc mining capabilities in database management systems. As a first step towards this goal, in [15] we proposed an architecture for supporting constraint-based, human-centered, exploratory mining of various kinds of rules including associations, introduced the notion of constrained frequent set queries (CFQs), and developed effective pruning optimizations for CFQs with 1-variable (1-var) constraints. While 1-var constraints are useful for constraining the antecedent and consequent separately, many natural examples of CFQs illustrate the need for constraining the antecedent and consequent jointly, for which 2-variable (2-var) constraints are indispensable. Developing pruning optimizations for CFQs with 2-var constraints is the subject of this paper. But this is a difficult problem because: (i) in 2-var constraints, both variables keep changing and, unlike 1-var constraints, there is no fixed target for pruning; (ii) as we show, “conventional” monotonicity-based optimization techniques do not apply effectively to 2-var constraints. The contributions are as follows. (1) We introduce a notion of quasi-succinctness , which allows a quasi-succinct 2-var constraint to be reduced to two succinct 1-var constraints for pruning. (2) We characterize the class of 2-var constraints that are quasi-succinct. (3) We develop heuristic techniques for non-quasi-succinct constraints. Experimental results show the effectiveness of all our techniques. (4) We propose a query optimizer for CFQs and show that for a large class of constraints, the computation strategy generated by the optimizer is ccc-optimal , i.e., minimizing the effort incurred w.r.t. constraint checking and support counting.Keywords
This publication has 13 references indexed in Scilit:
- Efficiently mining long patterns from databasesPublished by Association for Computing Machinery (ACM) ,1998
- Integrating association rule mining with relational database systemsPublished by Association for Computing Machinery (ACM) ,1998
- Exploratory mining and pruning optimizations of constrained associations rulesPublished by Association for Computing Machinery (ACM) ,1998
- Beyond market basketsPublished by Association for Computing Machinery (ACM) ,1997
- Association rules over interval dataPublished by Association for Computing Machinery (ACM) ,1997
- A database perspective on knowledge discoveryCommunications of the ACM, 1996
- Data mining using two-dimensional optimized association rulesPublished by Association for Computing Machinery (ACM) ,1996
- An effective hash-based algorithm for mining association rulesPublished by Association for Computing Machinery (ACM) ,1995
- Finding interesting rules from large sets of discovered association rulesPublished by Association for Computing Machinery (ACM) ,1994
- Mining association rules between sets of items in large databasesPublished by Association for Computing Machinery (ACM) ,1993