Sampling rare events: statistics of local sequence alignments
Preprint
- 13 August 2001
Abstract
A new method to simulate probability distributions in regions where the events are very unlikely (e.g. p ~ 10^{-40}) is presented. The basic idea is to represent the underlying probability space by the phase space of a physical system. The system is held at a temperature T, which is chosen such that the system preferentially generates configurations that originally have low probabilities. Since the distribution of such a physical system is known from statistical physics, the original unbiased distribution can be recovered. As an application, local alignment of protein sequences based on BLOSUM62 substitution scores with (12,1) affine gap costs is considered. The distribution of optimum sequence-alignment scores S is studied numerically over a large range of scores. The deviation of p(S) from the extreme-value (or Gumbel) distribution is quantified. This deviation decreases with growing sequence length.
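The reweighting idea can be illustrated with a minimal toy sketch. It is not the preprint's implementation: the score S is taken here to be the number of heads among n fair coins (a stand-in for an alignment score), configurations are sampled by a Metropolis chain with weight exp(S/T), and the unbiased distribution is recovered as p(S) = p_T(S) Z(T) exp(-S/T). The function names `sample_biased` and `reweight` are hypothetical, and the closed-form Z(T) used below exists only because the toy model is exactly solvable.

```python
import math
import random
from collections import Counter


def sample_biased(n, T, steps, seed=0):
    """Metropolis chain over n coins with weight exp(S/T), S = number of heads.

    Returns the normalized histogram p_T(S) of the biased ensemble.
    """
    rng = random.Random(seed)
    coins = [rng.randint(0, 1) for _ in range(n)]
    S = sum(coins)
    hist = Counter()
    burn_in = steps // 10
    for step in range(steps):
        i = rng.randrange(n)
        dS = 1 - 2 * coins[i]  # +1 for tails -> heads, -1 for heads -> tails
        # Accept with probability min(1, exp(dS / T)).
        if dS > 0 or rng.random() < math.exp(dS / T):
            coins[i] ^= 1
            S += dS
        if step >= burn_in:  # discard the initial transient
            hist[S] += 1
    total = sum(hist.values())
    return {s: c / total for s, c in hist.items()}


def reweight(p_T, n, T):
    """Undo the bias: p(S) = p_T(S) * Z(T) * exp(-S/T).

    Toy-model assumption: Z(T) = ((1 + exp(1/T)) / 2)**n is known exactly.
    """
    log_Z = n * math.log((1 + math.exp(1 / T)) / 2)
    return {s: math.exp(math.log(p) + log_Z - s / T) for s, p in p_T.items()}


n, T = 50, 0.5
estimate = reweight(sample_biased(n, T, steps=200_000), n, T)
exact = math.comb(n, 44) / 2**n  # exact binomial probability of S = 44
print(f"S=44: estimated {estimate[44]:.3e}, exact {exact:.3e}")
```

At T = 0.5 the biased chain spends most of its time near S = 44, a score whose unbiased probability is of order 10^{-8}, so the tail is resolved with far fewer samples than direct simulation would need. For a real scoring system Z(T) is not available in closed form and has to be fixed by other means, which this sketch does not attempt.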
All Related Versions
- Version 1, 2001-08-13, ArXiv
- Published version: Physical Review E, 65 (5), 056102.