The expected number of matches in optimal global sequence alignments

1 July 1993

journal article
research article
Published by Taylor & Francis in New Zealand Journal of Botany

Vol. 31 (3), 219-230
https://doi.org/10.1080/0028825x.1993.10419499

Abstract

Sequence comparison is used in molecular biology to detect and characterise the homology between two or more sequences. Many optimal alignment algorithms have been developed to produce the alignment with least overall cost. However, each of these methods depends upon the relative cost of a null being given a priori. This cost has usually been determined by simulation or Monte Carlo methods, or chosen to give “biologically interesting” results. This paper outlines how lattice walks and generating functions could be used to find the expected number of matches in the optimal alignment of two sequences, in several special cases. Solving the resulting equations proves difficult.
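For orientation, the simulation approach that the abstract contrasts with its analytical one can be sketched as below. This is a minimal illustration and not the paper's lattice-walk method. It assumes a simple cost scheme (matches cost 0, mismatches and nulls each carry a fixed cost) and independent, uniformly random sequences. The function names `optimal_alignment_matches` and `estimate_expected_matches` and all parameter defaults are hypothetical choices made for this example.

```python
import random


def optimal_alignment_matches(a, b, mismatch_cost=1, null_cost=1):
    """Return (cost, matches) for a least-cost global alignment of a and b.

    Assumed scheme: a match costs 0, a mismatch costs `mismatch_cost`,
    and each null (gap) costs `null_cost`. Among alignments of equal
    cost, the one with the most matches is reported.
    """
    # prev[j] holds (cost, -matches) for aligning a[:i] with b[:j];
    # storing -matches lets min() break cost ties towards more matches.
    prev = [(j * null_cost, 0) for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [(i * null_cost, 0)]
        for j in range(1, len(b) + 1):
            same = a[i - 1] == b[j - 1]
            diag = (
                prev[j - 1][0] + (0 if same else mismatch_cost),
                prev[j - 1][1] - (1 if same else 0),
            )
            up = (prev[j][0] + null_cost, prev[j][1])
            left = (curr[j - 1][0] + null_cost, curr[j - 1][1])
            curr.append(min(diag, up, left))
        prev = curr
    cost, neg_matches = prev[-1]
    return cost, -neg_matches


def estimate_expected_matches(n, alphabet="ACGT", trials=200, seed=0, **costs):
    """Monte Carlo estimate of the mean number of matches in an optimal
    alignment of two random length-n sequences (uniform i.i.d. letters)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        a = rng.choices(alphabet, k=n)
        b = rng.choices(alphabet, k=n)
        total += optimal_alignment_matches(a, b, **costs)[1]
    return total / trials
```

Calling `estimate_expected_matches(50, null_cost=2)` and then repeating with other values of `null_cost` gives a rough picture of how the expected number of matches responds to the relative cost of a null, which is the dependence the abstract highlights.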
