Seeded Bayesian Networks: Constructing genetic networks from microarray data
Open Access
- 4 July 2008
- journal article
- research article
- Published by Springer Nature in BMC Systems Biology
- Vol. 2 (1) , 1-13
- https://doi.org/10.1186/1752-0509-2-57
Abstract
DNA microarrays and other genomics-inspired technologies provide large datasets that often include hidden patterns of correlation between genes reflecting the complex processes that underlie cellular metabolism and physiology. The challenge in analyzing large-scale expression data has been to extract biologically meaningful inferences regarding these processes – often represented as networks – in an environment where the datasets are often imperfect and biological noise can obscure the actual signal. Although many techniques have been developed in an attempt to address these issues, to date their ability to extract meaningful and predictive network relationships has been limited. Here we describe a method that draws on prior information about gene-gene interactions to infer biologically relevant pathways from microarray data. Our approach consists of using preliminary networks derived from the literature and/or protein-protein interaction data as seeds for a Bayesian network analysis of microarray results. Through a bootstrap analysis of gene expression data derived from a number of leukemia studies, we demonstrate that seeded Bayesian Networks have the ability to identify high-confidence gene-gene interactions which can then be validated by comparison to other sources of pathway data. The use of network seeds greatly improves the ability of Bayesian Network analysis to learn gene interaction networks from gene expression data. We demonstrate that the use of seeds derived from the biomedical literature or high-throughput protein-protein interaction data, or the combination, provides improvement over a standard Bayesian Network analysis, allowing networks involving dynamic processes to be deduced from the static snapshots of biological systems that represent the most common source of microarray data. Software implementing these methods has been included in the widely used TM4 microarray analysis package.
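The bootstrap step mentioned in the abstract can be illustrated with a minimal sketch. It is not the TM4 implementation: the function names, cutoffs, and synthetic data are all hypothetical, and the structure learner is a deliberately simple stand-in (a correlation threshold that is more lenient for seed edges) rather than a real Bayesian network search. Edges are treated as undirected gene pairs listed in the same order as `genes`. What the sketch does show is the general idea of resampling the expression samples, relearning a network on each replicate, and reporting how often each edge reappears.

```python
import math
import random
from collections import Counter


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either input has zero variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def toy_learner(samples, genes, seed_edges, seed_cutoff=0.3, new_cutoff=0.7):
    """Stand-in for a structure search (assumption, not the paper's method):
    keep a gene pair if its absolute correlation passes a cutoff, with a
    lower cutoff for pairs that prior knowledge (the seed) already supports."""
    edges = set()
    for i, a in enumerate(genes):
        for b in genes[i + 1:]:
            r = abs(pearson([s[a] for s in samples], [s[b] for s in samples]))
            cutoff = seed_cutoff if (a, b) in seed_edges else new_cutoff
            if r >= cutoff:
                edges.add((a, b))
    return edges


def bootstrap_confidence(samples, genes, seed_edges, n_boot=100, rng=None):
    """Fraction of bootstrap replicates in which each edge is learned."""
    rng = rng or random.Random(0)
    counts = Counter()
    for _ in range(n_boot):
        # Resample the expression samples with replacement.
        resampled = [rng.choice(samples) for _ in samples]
        counts.update(toy_learner(resampled, genes, seed_edges))
    return {edge: c / n_boot for edge, c in counts.items()}


# Synthetic data (hypothetical): geneB tracks geneA, geneC is independent.
rng = random.Random(42)
genes = ["geneA", "geneB", "geneC"]
samples = []
for _ in range(40):
    a = rng.gauss(0, 1)
    samples.append({
        "geneA": a,
        "geneB": a + rng.gauss(0, 0.1),
        "geneC": rng.gauss(0, 1),
    })

seed = {("geneA", "geneB")}  # e.g. a literature-derived interaction
conf = bootstrap_confidence(samples, genes, seed, n_boot=50, rng=rng)
high_conf = {edge for edge, c in conf.items() if c >= 0.8}
print(sorted(high_conf))
```

In a real analysis the stand-in learner would be replaced by a Bayesian network structure search initialized from the seed network, and the confidence threshold (0.8 here) would be a tunable choice rather than a fixed value.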