Detecting periodic patterns in biological sequences.

Abstract
MOTIVATION: The search for repeated patterns in DNA and protein sequences is important in sequence analysis. The rapid increase in available sequences, in particular from large-scale genome sequencing projects, makes it relevant to develop sensitive automatic methods for the identification of repeats.

RESULTS: A new method for finding periodic patterns in biological sequences is presented. The method is based on evolutionary distance and 'phase shifts' corresponding to insertions and deletions. A given sequence is aligned to itself in a certain sense, with the aim of minimizing a distance to periodicity. Relationships between different such periodicity measures are discussed. An iterative algorithm is used, and the running time is nearly proportional to the sequence length. The alignment produces a periodic consensus pattern. A 'phase score' is used to indicate the statistical significance of the periodicity. Three examples using both DNA and protein sequences illustrate how the method can be used to find patterns.

AVAILABILITY: On request from the authors.

CONTACT: evindc@mat nu.no; finn.drablos@unimed.sintef.no
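
As a rough illustration of what a "distance to periodicity" and a "periodic consensus pattern" can mean, the sketch below folds a sequence at a fixed period, takes the most common symbol at each phase as the consensus, and counts mismatches against it. This is not the authors' algorithm: it ignores phase shifts (insertions and deletions), uses a plain mismatch count instead of an evolutionary distance, and computes no phase score. The function names `periodic_consensus` and `distance_to_periodicity` are hypothetical and chosen only for this example.

```python
from collections import Counter


def periodic_consensus(seq, period):
    """Return the most common symbol at each phase 0..period-1.

    Simplifying assumption: no insertions or deletions, so position i
    always belongs to phase i % period.
    """
    if not 1 <= period <= len(seq):
        raise ValueError("period must be between 1 and len(seq)")
    consensus = []
    for phase in range(period):
        column = seq[phase::period]  # all symbols sharing this phase
        consensus.append(Counter(column).most_common(1)[0][0])
    return "".join(consensus)


def distance_to_periodicity(seq, period):
    """Count positions where seq differs from its periodic consensus."""
    consensus = periodic_consensus(seq, period)
    return sum(1 for i, symbol in enumerate(seq)
               if symbol != consensus[i % period])


if __name__ == "__main__":
    example = "ACGTACGTACGAACGT"  # period-4 repeat with one substitution
    print(periodic_consensus(example, 4))
    print(distance_to_periodicity(example, 4))
```

Note that the raw mismatch count tends to shrink as the period grows (a period equal to the sequence length always gives zero), so comparing candidate periods would need some normalization or significance measure, which is the role the abstract assigns to the phase score.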
