Finding functional sequence elements by multiple local alignment
- 2 January 2004
- journal article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 32 (1), 189-200
- https://doi.org/10.1093/nar/gkh169
Abstract
Algorithms that detect and align locally similar regions of biological sequences have the potential to discover a wide variety of functional motifs. Two theoretical contributions to this classic but unsolved problem are presented here: a method to determine the width of the aligned motif automatically; and a technique for calculating the statistical significance of alignments, i.e. an assessment of whether the alignments are stronger than those that would be expected to occur by chance among random, unrelated sequences. Upon exploring variants of the standard Gibbs sampling technique to optimize the alignment, we discovered that simulated annealing approaches perform more efficiently. Finally, we conduct failure tests by applying the algorithm to increasingly difficult test cases, and analyze the manner of and reasons for eventual failure. Detection of transcription factor-binding motifs is limited by the motifs' intrinsic subtlety rather than by inadequacy of the alignment optimization procedure.
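The abstract contrasts standard Gibbs sampling with simulated annealing for optimizing a multiple local alignment. The following is a minimal illustrative sketch of that general idea, not the authors' implementation: it keeps one site per sequence, resamples each site from leave-one-out column counts, and lowers a temperature over successive sweeps. The function names (`column_counts`, `anneal_motif`), the fixed motif width, the uniform background, the pseudocount value, and the geometric cooling schedule are all assumptions made for illustration; the automatic width determination and significance calculation described in the abstract are not reproduced here.

```python
import math
import random

ALPHABET = "ACGT"


def column_counts(sequences, starts, width, skip=None, pseudocount=0.5):
    """Per-column letter counts of the aligned sites, optionally leaving one sequence out."""
    counts = [{a: pseudocount for a in ALPHABET} for _ in range(width)]
    for i, (seq, start) in enumerate(zip(sequences, starts)):
        if i == skip:
            continue
        for j in range(width):
            counts[j][seq[start + j]] += 1
    return counts


def anneal_motif(sequences, width, sweeps=200, t_start=1.0, t_end=0.1, seed=0):
    """Return one site start per sequence, chosen by a temperature-scaled site sampler.

    Assumptions (illustrative only): fixed width, uniform background,
    geometric cooling from t_start to t_end.
    """
    rng = random.Random(seed)
    background = 1.0 / len(ALPHABET)
    # Start from random site positions.
    starts = [rng.randrange(len(s) - width + 1) for s in sequences]
    for sweep in range(sweeps):
        # Geometric cooling: temp goes from t_start down to t_end.
        temp = t_start * (t_end / t_start) ** (sweep / max(1, sweeps - 1))
        for i, seq in enumerate(sequences):
            # Build the motif model from all sequences except the current one.
            counts = column_counts(sequences, starts, width, skip=i)
            total = sum(counts[0].values())
            log_scores = []
            for s in range(len(seq) - width + 1):
                # Log-odds of the candidate site under the motif model vs background.
                ll = sum(
                    math.log(counts[j][seq[s + j]] / total / background)
                    for j in range(width)
                )
                log_scores.append(ll / temp)
            # Subtract the maximum before exponentiating for numerical stability.
            top = max(log_scores)
            weights = [math.exp(x - top) for x in log_scores]
            starts[i] = rng.choices(range(len(weights)), weights=weights)[0]
    return starts
```

At a temperature of 1 each update is an ordinary Gibbs sampling step; as the temperature falls, the sampler concentrates on the highest-scoring sites, which is the sense in which annealing turns sampling into optimization in this sketch.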