Targeting novel folds for structural genomics
- 3 May 2002
- journal article
- research article
- Published by Wiley in Proteins-Structure Function and Bioinformatics
- Vol. 48 (1) , 44-52
- https://doi.org/10.1002/prot.10129
Abstract
The ultimate goal of structural genomics is to obtain the structure of each protein coded by each gene within a genome to determine gene function. Because of cost and time limitations, it remains impractical to solve the structure for every gene product experimentally. Up to a point, reasonably accurate three‐dimensional structures can be deduced for proteins with homologous sequences by using comparative modeling. Beyond this, fold recognition or threading methods can be used for proteins showing little homology to any known fold, although this is relatively time‐consuming and limited by the library of template folds currently available. Therefore, it is appropriate to develop methods that can increase our knowledge base, expanding our fold libraries by earmarking potentially “novel” folds for experimental structure determination. How can we sift through proteomic data rapidly and yet reliably identify novel folds as targets for structural genomics? We have analyzed a number of simple methods that discriminate between “novel” and “known” folds. We propose that simple alignments of secondary structure elements using predicted secondary structure could potentially be a more selective method than both a simple fold recognition method (GenTHREADER) and standard sequence alignment at finding novel folds when sequences show no detectable homology to proteins with known structures. Proteins 2002;48:44–52.Keywords
This publication has 24 references indexed in Scilit:
- SCOP: A structural classification of proteins database for the investigation of sequences and structuresPublished by Elsevier ,2006
- What are the baselines for protein fold recognition?Bioinformatics, 2001
- Estimating the number of protein folds and families from complete genome data 1 1Edited by J. ThorntonJournal of Molecular Biology, 2000
- Enhanced genome annotation using structural profiles in the program 3D-PSSM 1 1Edited by J. ThorntonJournal of Molecular Biology, 2000
- Protein secondary structure prediction based on position-specific scoring matrices 1 1Edited by G. Von HeijneJournal of Molecular Biology, 1999
- GenTHREADER: an efficient and reliable protein fold recognition method for genomic sequencesJournal of Molecular Biology, 1999
- 100,000 protein structures for the biologistNature Structural & Molecular Biology, 1998
- CATH – a hierarchic classification of protein domain structuresPublished by Elsevier ,1997
- Dictionary of protein secondary structure: Pattern recognition of hydrogen‐bonded and geometrical featuresBiopolymers, 1983
- A general method applicable to the search for similarities in the amino acid sequence of two proteinsJournal of Molecular Biology, 1970