On the statistical significance of nucleic acid similarities
- 1 January 1984
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 12 (1Part1), 215-226
- https://doi.org/10.1093/nar/12.1part1.215
Abstract
When evaluating sequence similarities among nucleic acids by the usual methods, statistical significance is often found when the biological significance of the similarity is dubious. We demonstrate that the known statistical properties of nucleic acid sequences strongly affect the statistical distribution of similarity values when calculated by standard procedures. We propose a series of models which account for some of these known statistical properties. The utility of the method is demonstrated in evaluating high relative similarity scores in four specific cases in which there is little biological context by which to judge the similarities. In two of the cases we identify the statistical properties which are responsible for the apparent similarity. In the other two cases the statistical significance of the similarity persists even when the known statistical properties of sequences are modelled. For one of these cases biological significance is likely while the other case remains an enigma.
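The abstract does not spell out the models themselves, so the following is only an illustrative sketch of the general idea it describes: judging a similarity score against a null distribution built from randomized sequences that keep a known statistical property of the original. Here the preserved property is base composition alone, the simplest possible choice; a model that also preserved, say, dinucleotide frequencies would be stricter. The function names `local_score` and `shuffle_pvalue`, the scoring parameters, and the example sequences are all assumptions made for illustration and are not taken from the paper.

```python
import random


def local_score(a, b, match=1, mismatch=-1, gap=-2):
    """Best local alignment score (Smith-Waterman, linear gap penalty).

    The scoring values are arbitrary illustrative defaults.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            score = max(0, diag, prev[j] + gap, cur[j - 1] + gap)
            cur.append(score)
            if score > best:
                best = score
        prev = cur
    return best


def shuffle_pvalue(a, b, trials=200, seed=0):
    """Empirical p-value of the observed score under a shuffle null.

    Shuffling b keeps its base composition but destroys its order,
    so the null reflects composition only (an assumed, minimal model).
    """
    rng = random.Random(seed)
    observed = local_score(a, b)
    letters = list(b)
    hits = 0
    for _ in range(trials):
        rng.shuffle(letters)
        if local_score(a, "".join(letters)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (trials + 1)


if __name__ == "__main__":
    seq_a = "ATATATATGCGCATATATAT"  # hypothetical A+T-rich sequences
    seq_b = "TATATATACGCGTATATATA"
    score, p = shuffle_pvalue(seq_a, seq_b)
    print(f"observed score = {score}, empirical p = {p:.3f}")
```

With strongly biased composition, as in the hypothetical A+T-rich inputs above, shuffled sequences can still score highly against each other, which is one way an apparently significant similarity can turn out to reflect composition rather than shared biology.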
This publication has 13 references indexed in Scilit:
- Random sequences. Journal of Molecular Biology, 1983
- Recognition of protein coding regions in DNA sequences. Nucleic Acids Research, 1982
- Pattern recognition in nucleic acid sequences. I. A general method for finding local homologies and symmetries. Nucleic Acids Research, 1982
- A + T-rich linkers define functional domains in eukaryotic DNA. Nature, 1982
- Structure of the rat prolactin gene. Journal of Biological Chemistry, 1980
- Strong adenine clustering in nucleotide sequences. Journal of Theoretical Biology, 1980
- Codon frequencies in 119 individual genes confirm consistent choices of degenerate bases according to genome type. Nucleic Acids Research, 1980
- Codon catalog usage and the genome hypothesis. Nucleic Acids Research, 1980
- Some rules in the ordering of nucleotides in the DNA. Nucleic Acids Research, 1980
- Computer analysis of nucleic acid regulatory sequences. Proceedings of the National Academy of Sciences, 1977