Approximations to Profile Score Distributions

Abstract
Profiles, which are summaries of multiple alignments of a sequence family, are used to find new instances of the family in databases. In this paper, we study the maximum score M obtained when the profile is aligned without indels at all possible positions of a random sequence. The main theorem gives an approximation to the distribution function of M with an explicit bound on the error. This theorem implies that M has a limiting extreme value distribution.