Structural genomics is the largest contributor of novel structural leverage
Open Access
- 5 February 2009
- journal article
- research article
- Published by Springer Nature in Journal of Structural and Functional Genomics
- Vol. 10 (2), 181-191
- https://doi.org/10.1007/s10969-008-9055-6
Abstract
The Protein Structure Initiative (PSI) at the US National Institutes of Health (NIH) is funding four large-scale centers for structural genomics (SG). These centers systematically target many large families without structural coverage, as well as very large families with inadequate structural coverage. Here, we report a few simple metrics that demonstrate how successfully these efforts optimize structural coverage: while the PSI-2 (2005-now) contributed more than 8% of all structures deposited into the PDB, it contributed over 20% of all novel structures (i.e., structures for protein sequences with no structural representative in the PDB on the date of deposition). The structural coverage of the protein universe represented by today’s UniProt (v12.8) has increased linearly from 1992 to 2008; structural genomics has contributed significantly to the maintenance of this growth rate. Success in increasing novel leverage (defined in Liu et al. in Nat Biotechnol 25:849–851, 2007) has resulted from systematic targeting of large families. PSI’s per-structure contribution to novel leverage was over 4-fold higher than that for non-PSI structural biology efforts during the past 8 years. If the success of the PSI continues, it may take only another ~15 years to cover most sequences in the current UniProt database.
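The two deposition shares quoted in the abstract imply a per-structure enrichment that can be checked with simple arithmetic. The sketch below is illustrative only: the function name `per_structure_enrichment` is hypothetical, and it treats the abstract's lower bounds ("more than 8%", "over 20%") as point values. It compares the rate of novel structures per deposited structure for PSI-2 against all other depositors. This is not the novel-leverage metric of Liu et al., which the abstract reports separately as over 4-fold; the code makes no attempt to reproduce that metric.

```python
def per_structure_enrichment(share_of_all: float, share_of_novel: float) -> float:
    """Ratio of novel-structure rates between a group and everyone else.

    share_of_all   -- the group's fraction of all deposited structures
    share_of_novel -- the group's fraction of all novel structures
    The unknown totals cancel, so only the two shares are needed.
    """
    if not (0.0 < share_of_all < 1.0 and 0.0 <= share_of_novel < 1.0):
        raise ValueError("shares must be fractions strictly below 1")
    group_rate = share_of_novel / share_of_all              # novel per structure, group
    rest_rate = (1.0 - share_of_novel) / (1.0 - share_of_all)  # novel per structure, rest
    return group_rate / rest_rate


# Assumption: the abstract's lower bounds are used as approximate point values.
ratio = per_structure_enrichment(share_of_all=0.08, share_of_novel=0.20)
print(f"PSI-2 structures were about {ratio:.1f}x as likely to be novel")
```

Under these assumed values, a PSI-2 deposition was roughly three times as likely to be novel as a non-PSI deposition; the exact factor depends on the true shares.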