A Bayesian framework for combining heterogeneous data sources for gene function prediction (in Saccharomyces cerevisiae)
- 25 June 2003
- journal article
- research article
- Published in Proceedings of the National Academy of Sciences
- Vol. 100 (14) , 8348-8353
- https://doi.org/10.1073/pnas.0832373100
Abstract
Genomic sequencing is no longer a novelty, but gene function annotation remains a key challenge in modern biology. A variety of functional genomics experimental techniques are available, from classic methods such as affinity precipitation to advanced high-throughput techniques such as gene expression microarrays. In the future, more disparate methods will be developed, further increasing the need for integrated computational analysis of data generated by these studies. We address this problem with MAGIC (Multisource Association of Genes by Integration of Clusters), a general framework that uses formal Bayesian reasoning to integrate heterogeneous types of high-throughput biological data (such as large-scale two-hybrid screens and multiple microarray analyses) for accurate gene function prediction. The system formally incorporates expert knowledge about relative accuracies of data sources to combine them within a normative framework. MAGIC provides a belief level with its output that allows the user to vary the stringency of predictions. We applied MAGIC to Saccharomyces cerevisiae genetic and physical interactions, microarray, and transcription factor binding sites data and assessed the biological relevance of gene groupings using Gene Ontology annotations produced by the Saccharomyces Genome Database. We found that by creating functional groupings based on heterogeneous data types, MAGIC improved accuracy of the groupings compared with microarray analysis alone. We describe several of the biological gene groupings identified.
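The abstract's central idea, weighting each data source by its assumed accuracy and reporting a belief level that the user can threshold, can be illustrated with a small sketch. This is not the network described in the paper: it substitutes a plain naive Bayes combination (sources treated as conditionally independent), and the source names, likelihoods, prior, and gene labels are all hypothetical values chosen only for illustration.

```python
# Hypothetical per-source reliabilities, standing in for expert knowledge:
# (P(evidence seen | pair related), P(evidence seen | pair unrelated)).
# These numbers are invented for illustration, not taken from the paper.
SOURCES = {
    "two_hybrid": (0.30, 0.02),
    "coexpression": (0.60, 0.20),
    "shared_tf_site": (0.40, 0.10),
}


def posterior_related(observed, prior=0.05, sources=SOURCES):
    """Belief that a gene pair is functionally related, given observed evidence.

    `observed` maps a source name to True/False (evidence seen or not).
    Sources missing from `observed` are treated as uninformative.
    Assumption: sources are conditionally independent (naive Bayes).
    """
    like_related = prior
    like_unrelated = 1.0 - prior
    for name, seen in observed.items():
        p_rel, p_unrel = sources[name]
        like_related *= p_rel if seen else 1.0 - p_rel
        like_unrelated *= p_unrel if seen else 1.0 - p_unrel
    return like_related / (like_related + like_unrelated)


def predict(pairs, threshold):
    """Keep only the pairs whose belief meets the chosen stringency."""
    kept = {}
    for pair, evidence in pairs.items():
        belief = posterior_related(evidence)
        if belief >= threshold:
            kept[pair] = belief
    return kept


# Hypothetical gene pairs with different amounts of supporting evidence.
pairs = {
    ("geneA", "geneB"): {"two_hybrid": True, "coexpression": True, "shared_tf_site": True},
    ("geneA", "geneC"): {"coexpression": True},
}

strict = predict(pairs, threshold=0.5)   # high stringency
lenient = predict(pairs, threshold=0.1)  # low stringency
print(sorted(strict), sorted(lenient))
```

Raising `threshold` trades coverage for confidence: with these made-up numbers, the pair supported by all three sources survives a strict cutoff, while the pair supported by coexpression alone is kept only at a lenient one.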