Protein networks markedly improve prediction of subcellular localization in multiple eukaryotic species
Open Access
- 4 October 2008
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 36 (20) , e136
- https://doi.org/10.1093/nar/gkn619
Abstract
The function of a protein is intimately tied to its subcellular localization. Although localizations have been measured for many yeast proteins through systematic GFP fusions, similar studies in other branches of life are still forthcoming. In the interim, various machine-learning methods have been proposed to predict localization using physical characteristics of a protein, such as amino acid content, hydrophobicity, side-chain mass and domain composition. However, there has been comparatively little work on predicting localization using protein networks. Here, we predict protein localizations by integrating an extensive set of protein physical characteristics over a protein's extended protein–protein interaction neighborhood, using a classification framework called ‘Divide and Conquer k-Nearest Neighbors’ (DC-kNN). These predictions achieve significantly higher accuracy than two well-known methods for predicting protein localization in yeast. Using new GFP imaging experiments, we show that the network-based approach can extend and revise previous annotations made from high-throughput studies. Finally, we show that our approach remains highly predictive in higher eukaryotes such as fly and human, in which most localizations are unknown and the protein network coverage is less substantial.
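The general idea of integrating protein features over an interaction neighborhood before classification can be illustrated with a small sketch. This is not the DC-kNN framework itself, whose details the abstract does not give; it assumes a plain k-nearest-neighbors majority vote, a simple average over direct interaction partners only, and entirely hypothetical protein names, two-dimensional feature vectors, and labels.

```python
from collections import Counter


def neighborhood_features(protein, features, interactions):
    """Average a protein's feature vector with those of its direct partners.

    Assumption: a plain mean over the immediate neighborhood; the paper's
    actual integration scheme is not described in the abstract.
    """
    members = [protein] + sorted(interactions.get(protein, ()))
    vectors = [features[m] for m in members if m in features]
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_predict(query, labeled, k=3):
    """Majority vote among the k labeled vectors closest to the query."""
    ranked = sorted(labeled, key=lambda item: squared_distance(query, item[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two made-up physical features per protein.
features = {
    "A": [0.9, 0.1],
    "B": [0.8, 0.2],
    "C": [0.1, 0.9],
    "D": [0.2, 0.8],
    "X": [0.5, 0.5],  # ambiguous on its own features
}
# Hypothetical undirected interaction network, stored as adjacency sets.
interactions = {
    "A": {"B", "X"},
    "B": {"A", "X"},
    "C": {"D"},
    "D": {"C"},
    "X": {"A", "B"},
}
# Hypothetical known localizations; "X" is the protein to predict.
known = {"A": "nucleus", "B": "nucleus",
         "C": "mitochondrion", "D": "mitochondrion"}

labeled = [(neighborhood_features(p, features, interactions), loc)
           for p, loc in known.items()]
query = neighborhood_features("X", features, interactions)
predicted = knn_predict(query, labeled, k=3)
print(predicted)
```

In this toy setup the unlabeled protein sits midway between the two classes on its own features, and it is the averaging over its interaction partners that pulls it toward one of them. A real application would need many more features, a principled choice of neighborhood depth and k, and held-out evaluation.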