An Integrated Sequence-Structure Database incorporating matching mRNA sequence, amino acid sequence and protein three-dimensional structure data
Open Access
- 1 January 1998
- journal article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 26 (1), 327-331
- https://doi.org/10.1093/nar/26.1.327
Abstract
We have constructed a non-homologous database, termed the Integrated Sequence-Structure Database (ISSD), which comprises the coding sequences of genes, the amino acid sequences of the corresponding proteins, their secondary structure and straight phi, psi angle assignments, and polypeptide backbone coordinates. Each protein entry in the database holds the alignment of the nucleotide sequence, the amino acid sequence and the PDB three-dimensional structure data. The nucleotide and amino acid sequences for each entry are selected on the basis of exact matches of the source organism and cell environment. The current version 1.0 of ISSD is available on the WWW at http://www.protein.bio.msu.su/issd/ and includes 107 non-homologous mammalian proteins, of which 80 are human proteins. We have used the database to analyse synonymous codon usage patterns in mRNA sequences, showing their correlation with three-dimensional structure features of the encoded proteins. Possible ISSD applications include optimisation of protein expression, improvement of protein structure prediction accuracy, and analysis of evolutionary aspects of the nucleotide sequence-protein structure relationship.
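The per-entry alignment described in the abstract can be pictured as one record per residue that ties a codon to its amino acid, a secondary structure label and a phi, psi pair. The following is a minimal Python sketch under stated assumptions, not the ISSD file format or the authors' software: the names `AlignedResidue` and `align_entry`, the one-letter secondary structure codes and the example values are all hypothetical, and the standard genetic code is assumed.

```python
from dataclasses import dataclass
from itertools import product
from typing import List, Optional, Tuple

# Standard genetic code, codons enumerated in TCAG order (DNA alphabet).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


@dataclass
class AlignedResidue:
    """One aligned position. Field names are hypothetical, not ISSD's own."""
    codon: str
    amino_acid: str
    secondary_structure: str  # assumed one-letter label, e.g. H, E or C
    phi: Optional[float]      # None where the angle is undefined (chain ends)
    psi: Optional[float]


def align_entry(
    cds: str,
    protein: str,
    secondary: str,
    angles: List[Tuple[Optional[float], Optional[float]]],
) -> List[AlignedResidue]:
    """Pair each codon with its residue and structural annotations."""
    dna = cds.upper().replace("U", "T")  # accept mRNA or DNA spelling
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    # Drop a trailing stop codon, which has no residue in the structure.
    if codons and CODON_TABLE.get(codons[-1]) == "*":
        codons.pop()
    if not (len(codons) == len(protein) == len(secondary) == len(angles)):
        raise ValueError("per-residue tracks differ in length")
    rows = []
    tracks = zip(codons, protein, secondary, angles)
    for pos, (codon, aa, ss, (phi, psi)) in enumerate(tracks, start=1):
        # Require an exact match between the codon and the listed residue.
        if CODON_TABLE.get(codon) != aa:
            raise ValueError(f"codon {codon} at position {pos} does not encode {aa}")
        rows.append(AlignedResidue(codon, aa, ss, phi, psi))
    return rows


# Made-up three-residue example, not taken from ISSD.
example = align_entry(
    "AUGGCUAAAUAA",
    "MAK",
    "CHH",
    [(None, 120.0), (-60.0, -45.0), (-60.0, None)],
)
for row in example:
    print(row)
```

Rejecting any codon that does not translate to the listed residue mirrors the exact-match selection criterion stated in the abstract. A real pipeline would also have to handle residues missing from the PDB coordinates, which this sketch does not attempt.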