Improving protein function prediction methods with integrated literature data
Open Access
- 15 April 2008
- journal article
- Published by Springer Nature in BMC Bioinformatics
- Vol. 9 (1) , 198
- https://doi.org/10.1186/1471-2105-9-198
Abstract
Determining the function of uncharacterized proteins is a major challenge in the post-genomic era due to the problem's complexity and scale. Identifying a protein's function contributes to an understanding of its role in the pathways in which it participates, its suitability as a drug target, and its potential for protein modifications. Several graph-theoretic approaches predict unidentified functions of proteins by using the functional annotations of better-characterized proteins in protein-protein interaction networks. We systematically consider the use of literature co-occurrence data, introduce a new method for quantifying the reliability of co-occurrence, and test how performance differs across species. We also quantify changes in performance as the prediction algorithms annotate with increased specificity.
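To make the graph-theoretic idea concrete, the sketch below shows one of the simplest forms of annotation transfer in an interaction network: each annotated neighbor of an uncharacterized protein votes for its own functions, and the most frequent functions are returned. This is an illustration only and not necessarily any of the methods evaluated in the article. The function name `predict_functions`, the protein identifiers, and the annotations are all hypothetical.

```python
from collections import Counter


def predict_functions(network, annotations, protein, top_k=1):
    """Rank candidate functions for `protein` by neighbor votes.

    network:     dict mapping a protein to the set of its interaction partners
    annotations: dict mapping a protein to the set of its known functions
    Both structures are assumptions made for this illustration.
    """
    votes = Counter()
    for neighbor in network.get(protein, ()):
        # Unannotated neighbors contribute no votes.
        for function in annotations.get(neighbor, ()):
            votes[function] += 1
    # Sort by descending vote count, then by name so ties are deterministic.
    ranked = sorted(votes.items(), key=lambda item: (-item[1], item[0]))
    return [function for function, _ in ranked[:top_k]]


# Toy, made-up network: P1 is uncharacterized and interacts with P2, P3, P4.
network = {
    "P1": {"P2", "P3", "P4"},
    "P2": {"P1"},
    "P3": {"P1"},
    "P4": {"P1"},
}
annotations = {
    "P2": {"kinase activity"},
    "P3": {"kinase activity", "ATP binding"},
    "P4": {"DNA binding"},
}

print(predict_functions(network, annotations, "P1", top_k=2))
# prints ['kinase activity', 'ATP binding']
```

In this unweighted form every interaction counts equally. A natural extension, in the spirit of the abstract, would be to weight each vote by a reliability score for the edge, for example one derived from literature co-occurrence; how such a score is computed is not specified here.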