The Efficient Computation of Position-Specific Match Scores with the Fast Fourier Transform
- 1 January 2002
- journal article
- research article
- Published by Mary Ann Liebert Inc in Journal of Computational Biology
- Vol. 9 (1) , 23-33
- https://doi.org/10.1089/10665270252833172
Abstract
Historically, in computational biology the fast Fourier transform (FFT) has been used almost exclusively to count the number of exact letter matches between two biosequences. This paper presents an FFT algorithm that can compute the match score of a sequence against a position-specific scoring matrix (PSSM). Our algorithm finds the PSSM score simultaneously over all offsets of the PSSM with the sequence, although like all previous FFT algorithms, it still disallows gaps. Although our algorithm is presented in the context of global matching, it can be adapted to local matching without gaps. As a benchmark, our PSSM-modified FFT algorithm computed pairwise match scores. In timing experiments, our most efficient FFT implementation for pairwise scoring appeared to be 10 to 26 times faster than a traditional FFT implementation, with only a factor of 2 in the acceleration attributable to a previously known compression scheme. Many important algorithms for detecting biosequence similarities, e.g., gapped BLAST or PSIBLAST, have a heuristic screening phase that disallows gaps. This paper demonstrates that FFT algorithms merit reconsideration in these screening applications.Keywords
This publication has 25 references indexed in Scilit:
- Hydrophobicity scales and computational techniques for detecting amphipathic structures in proteinsPublished by Elsevier ,2005
- Amino acid substitution matrices from an information theoretic perspectivePublished by Elsevier ,2005
- Iterated profile searches with PSI-BLAST—a tool for discovery in protein databasesTrends in Biochemical Sciences, 1998
- Gapped BLAST and PSI-BLAST: a new generation of protein database search programsNucleic Acids Research, 1997
- Search of hidden periodicities in DNA sequencesJournal of Theoretical Biology, 1995
- Issues in searching molecular sequence databasesNature Genetics, 1994
- A protein alignment scoring system sensitive at all evolutionary distancesJournal of Molecular Evolution, 1993
- Fourier methods for biosequence analysisNucleic Acids Research, 1990
- Aligning amino acid sequences: Comparison of commonly used methodsJournal of Molecular Evolution, 1985
- An efficient method for matching nucleic acid sequencesNucleic Acids Research, 1982