FORRepeats: detects repeats on entire chromosomes and between genomes
- 12 February 2003
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 19 (3) , 319-326
- https://doi.org/10.1093/bioinformatics/btf843
Abstract
As more whole genomes become available, there is a need for new methods to compare large sequences and to transfer biological knowledge from annotated genomes to related new ones. BLAST is not suited to comparing multimegabase DNA sequences, and MegaBLAST is designed to compare closely related large sequences. Some tools for detecting repeats in large sequences have already been developed, such as MUMmer and REPuter, but they too have time or space restrictions. Moreover, in terms of applications, REPuter only computes repeats and MUMmer works better with related genomes. We present a heuristic method, named FORRepeats, which is based on a novel data structure called the factor oracle. In the first step, it detects exact repeats in large sequences. In the second step, it computes approximate repeats and performs pairwise comparison. We compared its computational characteristics with those of BLAST and REPuter. The results demonstrate that it is fast and space-economical. We show FORRepeats' ability to perform intra-genomic comparison and to detect repeated DNA sequences in the complete genome of the model plant Arabidopsis thaliana.
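For readers unfamiliar with the data structure named in the abstract, the following is a minimal sketch of the textbook online construction of a factor oracle. It is not the FORRepeats implementation: the function names `build_factor_oracle` and `accepts`, the example sequence, and the dictionary-based representation are assumptions made for illustration only. The oracle has one state per sequence position plus an initial state, and it accepts at least every substring (factor) of the input. It may also accept a few strings that are not factors, so any candidate repeat read from it would still need to be verified against the sequence.

```python
def build_factor_oracle(seq):
    """Build a factor oracle for seq (illustrative sketch, not FORRepeats).

    Returns (trans, supply):
      trans[i]  maps a character to the target state of the transition from state i
      supply[i] is the supply (failure) link of state i, with supply[0] == -1
    """
    n = len(seq)
    trans = [dict() for _ in range(n + 1)]
    supply = [-1] * (n + 1)
    for i, c in enumerate(seq):
        # Internal transition: state i reads seq[i] and moves to state i + 1.
        trans[i][c] = i + 1
        # Walk the supply links, adding external transitions where c is missing.
        k = supply[i]
        while k > -1 and c not in trans[k]:
            trans[k][c] = i + 1
            k = supply[k]
        # The new state's supply link points to where the walk stopped.
        supply[i + 1] = 0 if k == -1 else trans[k][c]
    return trans, supply


def accepts(trans, word):
    """Return True if word can be read from the initial state of the oracle."""
    state = 0
    for c in word:
        nxt = trans[state].get(c)
        if nxt is None:
            return False
        state = nxt
    return True


if __name__ == "__main__":
    dna = "ACGTACGA"  # hypothetical toy sequence
    trans, supply = build_factor_oracle(dna)
    print(len(trans), "states;", sum(len(t) for t in trans), "transitions")
    print(accepts(trans, "TACG"))  # a true factor of dna
```

The construction is linear in the sequence length for a fixed alphabet, and the oracle holds at most 2n - 1 transitions for a sequence of length n, which is consistent with the space economy the abstract reports.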