Genome Holography: Deciphering Function-Form Motifs from Gene Expression Data
Open Access
- 16 July 2008
- journal article
- research article
- Published by Public Library of Science (PLoS) in PLOS ONE
- Vol. 3 (7) , e2708
- https://doi.org/10.1371/journal.pone.0002708
Abstract
DNA chips allow simultaneous measurement of the genome-wide response of thousands of genes, i.e., system-level monitoring of gene-network activity. Advanced analysis methods have been developed to extract meaningful information from the vast amount of raw gene-expression data obtained from microarray measurements. These methods have usually aimed to distinguish between groups of subjects (e.g., cancer patients vs. healthy subjects) or to identify marker genes that help distinguish between those groups. We assumed that motifs related to the internal structure of operons and to gene-network regulation are also embedded in microarray data and can be deciphered with proper analysis. The analysis presented here is based on investigating gene-gene correlations. We analyzed a database of gene expression of Bacillus subtilis exposed to sub-lethal levels of 37 different antibiotics. Using unsupervised analysis (dendrogram) of the matrix of normalized gene-gene correlations, we identified the operons, which form distinct clusters of genes in the sorted correlation matrix. Applying a dimension-reduction algorithm (Principal Component Analysis, PCA) to the matrices of normalized correlations reveals functional motifs. The genes are placed in a reduced three-dimensional space spanned by the three leading PCA eigenvectors, ranked by their corresponding eigenvalues. We found that the organization of the genes in the reduced PCA space recovers motifs of the operon internal structure, such as the order of the genes along the genome, gene separation by non-coding segments, and translational start and end regions. Beyond the intra-operon structure, it is also possible to predict inter-operon relationships, such as operons sharing functional regulation factors. In particular, we demonstrate the above in the context of the competence and sporulation pathways.
We demonstrated that, by analyzing gene-gene correlations from gene-expression data, it is possible to identify operons and to predict unknown internal structure of operons and gene-network regulation.
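The pipeline outlined in the abstract (gene-gene correlation matrix, dendrogram ordering, projection onto three leading PCA eigenvectors) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how the correlations are normalized, so plain Pearson correlations stand in for them here, average linkage is an assumed clustering choice, and the function names and the synthetic two-operon data are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, leaves_list


def correlation_matrix(expr):
    """Pearson gene-gene correlations; expr has shape (n_genes, n_conditions)."""
    return np.corrcoef(expr)


def dendrogram_order(corr):
    """Gene ordering from average-linkage clustering of the correlation rows.

    Assumption: each gene is described by its row of correlations with all
    other genes, compared by Euclidean distance.
    """
    tree = linkage(corr, method="average", metric="euclidean")
    return leaves_list(tree)


def pca_embedding(corr, n_components=3):
    """Project each gene (a row of corr) onto the leading principal components."""
    centered = corr - corr.mean(axis=0)
    cov = centered.T @ centered / (corr.shape[0] - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)            # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:n_components]    # indices of the largest eigenvalues
    return centered @ eigvecs[:, top]


# Synthetic stand-in data (hypothetical): two "operons" of 4 genes each,
# measured under 37 conditions, each operon following its own shared signal.
rng = np.random.default_rng(0)
n_conditions = 37
signal_a = rng.normal(size=n_conditions)
signal_b = rng.normal(size=n_conditions)
expr = np.vstack(
    [signal_a + 0.1 * rng.normal(size=n_conditions) for _ in range(4)]
    + [signal_b + 0.1 * rng.normal(size=n_conditions) for _ in range(4)]
)

corr = correlation_matrix(expr)       # (8, 8) gene-gene correlations
order = dendrogram_order(corr)        # leaf order used to sort the matrix
sorted_corr = corr[np.ix_(order, order)]
coords = pca_embedding(corr)          # (8, 3) gene positions in PCA space
```

In the sorted matrix, co-regulated genes appear as blocks along the diagonal, which is the sense in which operons "form distinct clusters". The rows of `coords` give each gene's position in the reduced three-dimensional space.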