Inferring pathways from gene lists using a literature-derived network of biological relationships
Open Access
- 27 October 2004
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 21 (6) , 788-793
- https://doi.org/10.1093/bioinformatics/bti069
Abstract
Motivation: A number of omic technologies such as transcriptional profiling, proteomics, literature searches, genetic association, etc. help in the identification of sets of important genes. A subset of these genes may act in a coordinated manner, possibly because they are part of the same biological pathway. Interpreting such gene lists and relating them to pathways is a challenging task. Databases of biological relationships between thousands of mammalian genes can help in deciphering omics data. The relationships between genes can be assembled into a biological network with each protein as a node and each relationship as an edge between two proteins (or nodes). This network may then be searched for subnetworks consisting largely of interesting genes from the omics experiment. The subset of genes in the subnetwork along with the web of relationships between them helps to decipher the underlying pathways. Finding such subnetworks that maximally include all proteins from the query set but few others is the focus for this paper.

Results: We present a heuristic algorithm and a scoring function that work well both on simulated data and on data from known pathways. The scoring function is an extension of a previous study for a single biological experiment. We use a simple set of heuristics that provide a more efficient solution than the simulated annealing method. We find that our method works on reasonably complex curated networks containing ∼9000 biological entities (genes and metabolites), and ∼30 000 biological relationships. We also show that our method can pick up a pathway signal from a query list including a moderate number of genes unrelated to the pathway. In addition, we quantify the sensitivity and specificity of the technique.

Contact: dilip_rajagopalan@gsk.com
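The subnetwork search described in the abstract can be illustrated with a small sketch. The abstract does not give the scoring function or the heuristics, so everything below is an assumption made for illustration: the toy score (query members minus a penalty per non-query member), the greedy one-or-two-node growth step, the `penalty` value, and all function and node names are hypothetical and are not the authors' method.

```python
from collections import defaultdict


def build_network(relationships):
    """Undirected adjacency map: each entity is a node, each relationship an edge."""
    adj = defaultdict(set)
    for a, b in relationships:
        adj[a].add(b)
        adj[b].add(a)
    return dict(adj)


def score(subnetwork, query, penalty):
    """Toy score (an assumption, not the paper's function): reward query
    members of the subnetwork and penalise non-query members."""
    hits = len(subnetwork & query)
    return hits - penalty * (len(subnetwork) - hits)


def grow_subnetwork(adj, query, seed, penalty=0.5):
    """Greedy growth from a seed node. Each step adds either one frontier
    node, or one frontier node plus one of its outside neighbours (so a
    single non-query linker can be crossed), whichever raises the score
    most. Growth stops when no move improves the score."""
    sub = {seed}
    while True:
        current = score(sub, query, penalty)
        best_gain, best_move = 0.0, None
        frontier = {n for s in sub for n in adj.get(s, set())} - sub
        for n in sorted(frontier):
            outside = sorted(adj.get(n, set()) - sub - {n})
            for move in [{n}] + [{n, m} for m in outside]:
                gain = score(sub | move, query, penalty) - current
                if gain > best_gain:
                    best_gain, best_move = gain, move
        if best_move is None:
            return sub
        sub |= best_move


def sensitivity_specificity(found, true_pathway, all_nodes):
    """Sensitivity and specificity of a recovered node set against a known
    pathway. Assumes the pathway and its complement are both non-empty."""
    tp = len(found & true_pathway)
    fn = len(true_pathway - found)
    fp = len(found - true_pathway)
    tn = len(all_nodes - found - true_pathway)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy network: pathway A-B-C-D-E, where C is a non-query linker,
# and Z is an unrelated query gene two linkers away from the pathway.
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"),
         ("A", "X"), ("X", "Y"), ("Y", "Z"), ("E", "N1")]
network = build_network(edges)
query_genes = {"A", "B", "D", "E", "Z"}

found = grow_subnetwork(network, query_genes, seed="A")
sens, spec = sensitivity_specificity(found, {"A", "B", "C", "D", "E"}, set(network))
print(sorted(found), sens, spec)
```

In this toy case the linker C is pulled in because it connects query genes, while the unrelated query gene Z is left out because reaching it costs more in non-query nodes than it gains. A heavier or lighter `penalty` shifts that trade-off, which is the kind of behaviour a sensitivity and specificity analysis would quantify.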