GTOP: a database of protein structures predicted from genome sequences
Open Access
- 1 January 2002
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 30 (1), 294-298
- https://doi.org/10.1093/nar/30.1.294
Abstract
Large-scale genome projects generate an unprecedented number of protein sequences, most of which are experimentally uncharacterized. Predicting the 3D structures of these sequences provides important clues as to their functions. We constructed the Genomes TO Protein structures and functions (GTOP) database, which contains protein fold predictions for a huge number of sequences. Predictions are mainly carried out with the homology search program PSI-BLAST, currently the most popular among high-sensitivity profile search methods. GTOP also includes the results of other analyses, e.g. homology and motif searches and the detection of transmembrane helices and repetitive sequences. We have completed the analysis of the sequences of 41 organisms, with the number of proteins exceeding 120 000 in total. GTOP uses a graphical viewer to present the analytical results for each ORF on one page in a ‘color-bar’ format. The assigned 3D structures are displayed with the Chime plug-in or RasMol. The binding sites of ligands are also included, providing functional information. The GTOP server is available at http://spock.genes.nig.ac.jp/~genome/gtop.html.