Introduction to Suffix Trees

28 May 1997

book chapter
Published by Cambridge University Press (CUP)

p. 89-93
https://doi.org/10.1017/cbo9780511574931.007

Abstract

A suffix tree is a data structure that exposes the internal structure of a string in a deeper way than does the fundamental preprocessing discussed in Section 1.3. Suffix trees can be used to solve the exact matching problem in linear time (achieving the same worst-case bound that the Knuth–Morris–Pratt and the Boyer–Moore algorithms achieve), but their real virtue comes from their use in linear-time solutions to many string problems more complex than exact matching. Moreover (as we will detail in Chapter 9), suffix trees provide a bridge between exact matching problems, the focus of Part I, and inexact matching problems, the focus of Part III.

The classic application for suffix trees is the substring problem. One is first given a text T of length m. After O(m), or linear, preprocessing time, one must be prepared to take in any unknown string S of length n and in O(n) time either find an occurrence of S in T or determine that S is not contained in T. That is, the allowed preprocessing takes time proportional to the length of the text, but thereafter, the search for S must be done in time proportional to the length of S, independent of the length of T. These bounds are achieved with the use of a suffix tree. The suffix tree for the text is built in O(m) time during a preprocessing stage; thereafter, whenever a string of length n is input, the algorithm searches for it in O(n) time using that suffix tree.
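The preprocess-then-query shape of the substring problem can be illustrated with a small Python sketch. This is not the linear-time suffix tree construction the chapter refers to: it builds a naive, uncompressed suffix trie by inserting every suffix of T character by character, which takes O(m^2) time and space. It is used here only because it is short and makes the query bound visible. The names `SuffixTrie`, `_Node`, and `find` are hypothetical and chosen for illustration.

```python
class _Node:
    """One trie node; `start` is where some occurrence of its path label begins in T."""
    __slots__ = ("children", "start")

    def __init__(self, start: int) -> None:
        self.children: dict[str, "_Node"] = {}
        self.start = start


class SuffixTrie:
    """Naive suffix trie (assumption: stand-in for a true suffix tree).

    Build cost is O(m^2), not the O(m) of a real suffix tree construction.
    Query cost is O(n) expected, independent of the text length m.
    """

    def __init__(self, text: str) -> None:
        self.root = _Node(0)
        # Preprocessing: insert every suffix text[i:] into the trie.
        for i in range(len(text)):
            node = self.root
            for ch in text[i:]:
                child = node.children.get(ch)
                if child is None:
                    # First suffix to reach this path records its start index.
                    child = _Node(i)
                    node.children[ch] = child
                node = child

    def find(self, pattern: str) -> int:
        """Return the start index of an occurrence of pattern in T, or -1 if none."""
        node = self.root
        for ch in pattern:  # one dictionary lookup per pattern character
            node = node.children.get(ch)
            if node is None:
                return -1
        return node.start


trie = SuffixTrie("xabxac")   # preprocessing stage on the text T
print(trie.find("abx"))       # query stage: a pattern S that occurs in T
print(trie.find("bc"))        # a pattern S that does not occur in T
```

Each query follows at most one edge per character of S, so its cost depends on n alone. A real suffix tree keeps that query behaviour while compressing chains of single-child nodes into labelled edges, which is what makes the O(m) preprocessing bound in the abstract attainable.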
