Approximations to Profile Score Distributions
- 1 January 1994
- journal article
- research article
- Published by Mary Ann Liebert Inc in Journal of Computational Biology
- Vol. 1 (2), 93-104
- https://doi.org/10.1089/cmb.1994.1.93
Abstract
Profiles, which are summaries of multiple alignments of a sequence family, are used to find new instances of the family in databases. In this paper, we study the maximum score M obtained when the profile is aligned without indels at all possible positions of a random sequence. The main theorem gives an approximation to the distribution function of M with an explicit bound on the error. This theorem implies that M has a limiting extreme value distribution.
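To make the quantity M concrete, the following is a minimal sketch of how the maximum ungapped score could be computed and sampled. It is not the paper's method or code: the function names, the toy three-column DNA profile, and the assumption of independent, identically distributed letters are all illustrative choices.

```python
import random


def max_ungapped_score(profile, sequence):
    """Return M: the best score over all placements of the profile
    against the sequence, with no insertions or deletions.

    `profile` is a list of dicts, one per column, mapping letter -> score
    (a hypothetical representation chosen for this sketch).
    """
    width = len(profile)
    if len(sequence) < width:
        raise ValueError("sequence is shorter than the profile")
    best = None
    # Slide the profile along the sequence, one start position at a time.
    for start in range(len(sequence) - width + 1):
        score = sum(profile[j][sequence[start + j]] for j in range(width))
        if best is None or score > best:
            best = score
    return best


def random_sequence(length, alphabet, weights, rng):
    """Draw a sequence of i.i.d. letters (an assumption of this sketch)."""
    return "".join(rng.choices(alphabet, weights=weights, k=length))


def sample_max_scores(profile, length, alphabet, weights, trials, seed=0):
    """Simulate M on `trials` independent random sequences."""
    rng = random.Random(seed)
    return [
        max_ungapped_score(profile, random_sequence(length, alphabet, weights, rng))
        for _ in range(trials)
    ]


# Toy profile favouring the motif "ACG"; the scores are made up.
toy_profile = [
    {"A": 2, "C": -1, "G": -1, "T": -1},
    {"A": -1, "C": 2, "G": -1, "T": -1},
    {"A": -1, "C": -1, "G": 2, "T": -1},
]

if __name__ == "__main__":
    samples = sample_max_scores(toy_profile, 200, "ACGT", [0.25] * 4, trials=1000)
    # Empirical estimate of P(M >= 6), i.e. the motif appears exactly.
    print(sum(m >= 6 for m in samples) / len(samples))
```

Tabulating the simulated values of M gives an empirical distribution function, which is the object the paper's main theorem approximates analytically.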
This publication has 16 references indexed in Scilit:
- Efficient methods for multiple sequence alignment with guaranteed error bounds. Bulletin of Mathematical Biology, 1993
- Poisson Approximation and the Chen-Stein Method. Statistical Science, 1990
- Methods for assessing the statistical significance of molecular sequence features by using general scoring schemes. Proceedings of the National Academy of Sciences, 1990
- Two Moments Suffice for Poisson Approximations: The Chen-Stein Method. The Annals of Probability, 1989
- The Multiple Sequence Alignment Problem in Biology. SIAM Journal on Applied Mathematics, 1988
- Stochastic scrabble: large deviations for sequences with scores. Journal of Applied Probability, 1988
- Profile analysis: detection of distantly related proteins. Proceedings of the National Academy of Sciences, 1987
- Simian Sarcoma Virus onc Gene, v-sis, Is Derived from the Gene (or Genes) Encoding a Platelet-Derived Growth Factor. Science, 1983
- Similar Amino Acid Sequences: Chance or Common Ancestry? Science, 1981
- Estimating probabilities for normal extremes. Advances in Applied Probability, 1980