GeneMANIA: a real-time multiple association network integration algorithm for predicting gene function
Open Access
- 27 June 2008
- journal article
- method
- Published by Springer Nature in Genome Biology
- Vol. 9 (S1), 1-15
- https://doi.org/10.1186/gb-2008-9-s1-s4
Abstract
Background: Most successful computational approaches for protein function prediction integrate multiple genomics and proteomics data sources to make inferences about the function of unknown proteins. The most accurate of these algorithms have long running times, making them unsuitable for real-time protein function prediction in large genomes. As a result, the predictions of these algorithms are stored in static databases that can easily become outdated. We propose a new algorithm, GeneMANIA, that is as accurate as the leading methods, while capable of predicting protein function in real-time.
Results: We use a fast heuristic algorithm, derived from ridge regression, to integrate multiple functional association networks and predict gene function from a single process-specific network using label propagation. Our algorithm is efficient enough to be deployed on a modern webserver and is as accurate as, or more so than, the leading methods on the MouseFunc I benchmark and a new yeast function prediction benchmark; it is robust to redundant and irrelevant data and requires, on average, less than ten seconds of computation time on tasks from these benchmarks.
Conclusion: GeneMANIA is fast enough to predict gene function on-the-fly while achieving state-of-the-art accuracy. A prototype version of a GeneMANIA-based webserver is available at http://morrislab.med.utoronto.ca/prototype.
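The two-stage idea in the Results paragraph (weight several association networks with a ridge-regression-style fit, then run label propagation on the single combined network) can be illustrated with a small sketch. The abstract gives no formulas, so everything below is an assumption made for illustration: the target matrix built from the label vector, the clipping of negative weights, the system (I + L) f = y used for propagation, and the names `combine_networks`, `propagate_labels`, and the toy data. It is not the authors' implementation.

```python
import numpy as np


def combine_networks(networks, y, ridge=1.0):
    """Weight each network by a ridge-regression fit to the label target y y^T.

    Assumed, simplified form: the abstract only says the weighting heuristic
    is "derived from ridge regression".
    """
    n = len(y)
    off_diag = ~np.eye(n, dtype=bool)           # ignore self-associations
    target = np.outer(y, y)[off_diag]           # +1 same label, -1 different, 0 unknown
    X = np.column_stack([W[off_diag] for W in networks])
    A = X.T @ X + ridge * np.eye(X.shape[1])    # ridge-regularized normal equations
    weights = np.linalg.solve(A, X.T @ target)
    weights = np.clip(weights, 0.0, None)       # assumption: drop negatively weighted networks
    composite = sum(w * W for w, W in zip(weights, networks))
    return weights, composite


def propagate_labels(W, y):
    """Label propagation as the solution of (I + L) f = y, with L = D - W."""
    laplacian = np.diag(W.sum(axis=1)) - W
    return np.linalg.solve(np.eye(len(y)) + laplacian, y)


def network_from_edges(n, edges):
    """Build a symmetric 0/1 association matrix from an undirected edge list."""
    W = np.zeros((n, n))
    for i, j in edges:
        W[i, j] = W[j, i] = 1.0
    return W


# Toy data (hypothetical): six genes; 0 and 1 are known positives,
# 3 and 4 are known negatives, 2 and 5 are unlabeled.
y = np.array([1.0, 1.0, 0.0, -1.0, -1.0, 0.0])

# Network 1 links genes within the groups {0, 1, 2} and {3, 4, 5}.
informative = network_from_edges(6, [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)])
# Network 2 links genes across the groups, so it contradicts the labels.
misleading = network_from_edges(6, [(0, 3), (1, 4), (2, 5)])

weights, composite = combine_networks([informative, misleading], y)
scores = propagate_labels(composite, y)

print("network weights:", weights)
print("score for unlabeled gene 2:", scores[2])
print("score for unlabeled gene 5:", scores[5])
```

In this toy setting the network that contradicts the labels is clipped to zero weight, so the unlabeled genes are scored using only the network that agrees with the known annotations. Solving one linear system per query is what makes a scheme of this kind plausible for on-the-fly use; a dense solve is used here only because the example is tiny.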