Substring selectivity estimation

Abstract
With the explosion of the Internet, LDAP directories and XML, there is an ever greater need to evaluate queries involving (sub)string matching. Effective query optimization in this context requires good selectivity estimates. In this paper, we use pruned count-suffix trees as the basic framework for substring selectivity estimation. We present a novel technique to obtain a good estimate for a given substring matching query, called MO (for Maximal Overlap), that estimates the selectivity of a...
