Chance and Statistical Significance in Protein and DNA Sequence Analysis
- 3 July 1992
- journal article
- research article
- Published by American Association for the Advancement of Science (AAAS) in Science
- Vol. 257 (5066), 39-49
- https://doi.org/10.1126/science.1621093
Abstract
Statistical approaches help in the determination of significant configurations in protein and nucleic acid sequence data. Three recent statistical methods are discussed: (i) score-based sequence analysis that provides a means for characterizing anomalies in local sequence text and for evaluating sequence comparisons; (ii) quantile distributions of amino acid usage that reveal general compositional biases in proteins and evolutionary relations; and (iii) r-scan statistics that can be applied to the analysis of spacings of sequence markers.
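As a rough illustration of the raw quantities behind methods (i) and (iii), the sketch below computes a highest-scoring contiguous segment and the r-scan lengths of a set of marker positions. It is a minimal sketch under stated assumptions, not the article's procedure: the function names, the toy scoring table, and the example positions are all hypothetical, and the step that assigns statistical significance to these quantities is not shown.

```python
def max_segment_score(seq, scores):
    """Return (best_score, start, end) for the highest-scoring contiguous
    segment of seq, where seq[start:end] is the segment.

    `scores` maps each letter to a numeric score (hypothetical values here).
    The running sum is reset whenever it drops to zero or below.
    """
    best, best_start, best_end = 0, 0, 0
    running, start = 0, 0
    for i, letter in enumerate(seq):
        running += scores[letter]
        if running <= 0:
            # A non-positive prefix cannot help any later segment.
            running, start = 0, i + 1
        elif running > best:
            best, best_start, best_end = running, start, i + 1
    return best, best_start, best_end


def r_scan_lengths(positions, r):
    """Return the sums of r consecutive spacings between sorted marker
    positions, i.e. the distance from each marker to the r-th marker after it.
    Unusually small values hint at clustering, unusually large ones at gaps.
    """
    pos = sorted(positions)
    return [pos[i + r] - pos[i] for i in range(len(pos) - r)]


# Toy scoring scheme (an assumption for illustration only).
toy_scores = {"K": 2, "A": -1, "E": -1}
segment = max_segment_score("AKKKAEA", toy_scores)

# Hypothetical marker positions along a sequence.
lengths = r_scan_lengths([10, 12, 30, 31, 33, 80], r=2)
```

Here `segment` identifies the run `KKK`, and `min(lengths)` and `max(lengths)` are the extremes that an r-scan analysis would then compare against what chance alone would produce.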