Data mining, hypergraph transversals, and machine learning (extended abstract)

1 May 1997

proceedings article
Published by Association for Computing Machinery (ACM)

p. 209-216
https://doi.org/10.1145/263661.263684

Abstract

Several data mining problems can be formulated as problems of finding maximally specific sentences that are interesting in a database. We first show that this problem has a close relationship with the hypergraph transversal problem. We then analyze two algorithms that have been previously used in data mining, proving upper bounds on their complexity. The first algorithm is useful when the maximally specific interesting sentences are "small". We show that this al- gorithm can also be used to efficiently solve a special case of the hypergraph transversal problem, improving on previ- ous results. The second algorithm utilizes a subroutine for hypergraph transversals, and is applicable in more general situations, with complexity close to a lower bound for the problem. We also relate these problems to the model of ex- act learning in computational learning theory, and use the correspondence to derive some corollaries.

Keywords

This publication has 9 references indexed in Scilit:

On the Complexity of Dualization of Monotone Disjunctive Normal Forms
Journal of Algorithms, 1996
Oracles and Queries That Are Sufficient for Exact Learning
Journal of Computer and System Sciences, 1996
Identifying the Minimal Transversals of a Hypergraph and Related Problems
SIAM Journal on Computing, 1995
Efficient discovery of interesting statements in databases
Journal of Intelligent Information Systems, 1995
First-order jk-clausal theories are PAC-learnable
Artificial Intelligence, 1994
Algorithms for inferring functional dependencies from relations
Data & Knowledge Engineering, 1994
Mining association rules between sets of items in large databases
Published by Association for Computing Machinery (ACM) ,1993
Queries and Concept Learning
Machine Learning, 1988
Design by example: An application of Armstrong relations
Journal of Computer and System Sciences, 1986