Substring selectivity estimation

1 May 1999

proceedings article
Published by Association for Computing Machinery (ACM)

p. 249-260
https://doi.org/10.1145/303976.304001

Abstract

With the explosion of the Internet, LDAP directoriesand XML, there is an ever greater need to evaluatequeries involving (sub)string matching. Effective queryoptimization in this context requires good selectivityestimates. In this paper, we use pruned count-suffixtrees as the basic framework for substring selectivityestimation.We present a novel technique to obtain a good estimatefor a given substring matching query, called MO(for Maximal Overlap), that estimates the selectivity ofa...

Keywords

This publication has 9 references indexed in Scilit:

Improved histograms for selectivity estimation of range predicates
Published by Association for Computing Machinery (ACM) ,1996
Estimating alphanumeric selectivity in the presence of wildcards
Published by Association for Computing Machinery (ACM) ,1996
The merge/purge problem for large databases
Published by Association for Computing Machinery (ACM) ,1995
Balancing histogram optimality and practicality for query result size estimation
Published by Association for Computing Machinery (ACM) ,1995
Query size estimation by adaptive sampling (extended abstract)
Published by Association for Computing Machinery (ACM) ,1990
Equi-depth multidimensional histograms
Published by Association for Computing Machinery (ACM) ,1988
Access path selection in a relational database management system
Published by Association for Computing Machinery (ACM) ,1979
A Space-Economical Suffix Tree Construction Algorithm
Journal of the ACM, 1976
Linear pattern matching algorithms
Published by Institute of Electrical and Electronics Engineers (IEEE) ,1973