Oligopeptide biases in protein sequences and their use in predicting protein coding regions in nucleotide sequences
- 1 January 1988
- journal article
- research article
- Published by Wiley in Proteins-Structure Function and Bioinformatics
- Vol. 4 (2) , 99-122
- https://doi.org/10.1002/prot.340040204
Abstract
We have examined oligopeptides with lengths ranging from 2 to 11 residues in protein sequences that show no obvious evolutionary relationship. All sequences in the Protein Identification Resource database were carefully classified by sensitive homology searches into superfamilies to obtain unbiased oligopeptide counts. The results, contrary to previous studies, show clear biases in protein sequences. The oligopeptide preferences were used to help decide the significance of sequence homologies and to improve the more general methods for detecting protein coding regions within nucleotide sequences.
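As a rough illustration of what an oligopeptide count and a bias measure can look like, the sketch below tallies all overlapping peptides of length k in a set of sequences and compares each count with the number expected if residues occurred independently. This is not the procedure used in the paper, which the abstract does not specify; the function names, the independence model, and the toy sequences are assumptions made for illustration, and the input is assumed to have already been reduced to evolutionarily unrelated sequences.

```python
from collections import Counter
from math import prod


def oligopeptide_counts(sequences, k):
    """Count every overlapping peptide of length k (hypothetical helper)."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def observed_over_expected(sequences, k):
    """Ratio of observed count to the count expected under residue independence.

    Assumption: the expected count of a peptide is the number of length-k
    windows times the product of its residues' overall frequencies.
    """
    residues = Counter("".join(sequences))
    total_residues = sum(residues.values())
    freq = {aa: n / total_residues for aa, n in residues.items()}

    counts = oligopeptide_counts(sequences, k)
    total_windows = sum(counts.values())

    ratios = {}
    for peptide, observed in counts.items():
        expected = total_windows * prod(freq[aa] for aa in peptide)
        ratios[peptide] = observed / expected
    return ratios


# Toy input, not real protein data.
toy_sequences = ["AAAA", "AB"]
dipeptide_counts = oligopeptide_counts(toy_sequences, 2)
dipeptide_ratios = observed_over_expected(toy_sequences, 2)
```

A ratio well above or below 1 would suggest that a peptide is over- or under-represented relative to this simple null model; for the longer peptides in the 2 to 11 residue range most counts are small, so such ratios are noisy without a large, non-redundant data set.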