Quantitative Trait Associated Microarray Gene Expression Data Analysis
Open Access
- 26 May 2006
- journal article
- research article
- Published by Oxford University Press (OUP) in Molecular Biology and Evolution
- Vol. 23 (8) , 1558-1573
- https://doi.org/10.1093/molbev/msl019
Abstract
Selection on phenotypes may cause genetic change. To understand the relationship between phenotype and gene expression from an evolutionary viewpoint, it is important to study the concordance between gene expression profiles and phenotypes. In this study, we use a novel clustering method to identify genes whose expression profiles are related to a quantitative phenotype. Cluster analysis of gene expression data aims to classify genes into groups based on the similarity of their expression profiles across multiple conditions. The hope is that genes classified into the same cluster may share underlying regulatory elements or belong to the same metabolic pathway. Current methods for examining the association between phenotype and gene expression are limited to linear association, measured by the correlation between individual gene expression values and the phenotype. However, genes may be associated with the phenotype in a nonlinear fashion. In addition, groups of genes that share a particular pattern in their relationship to the phenotype may be of evolutionary interest. We therefore develop a method that groups genes based on orthogonal polynomials under a multivariate Gaussian mixture model. The effect of each expressed gene on the phenotype is partitioned into a cluster mean and a random deviation from that mean. Genes can also be clustered based on a time series. Parameters are estimated using the expectation–maximization algorithm, and the method is implemented in SAS. The method is verified with simulated data and demonstrated with experimental data from two studies: one clusters genes with respect to severity of disease in Alzheimer's patients, and the other clusters genes over time in a rat fracture healing study. We find significant evidence of nonlinear associations in both studies and successfully describe these patterns with our method.
We give detailed instructions and provide a working program that allows others to directly implement this method in their own analyses.
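To make the idea in the abstract concrete, the following is a minimal Python sketch, not the authors' SAS program. It assumes a simplified two-step version of the approach: each gene's expression is first regressed on a Legendre (orthogonal) polynomial basis of the phenotype, and the resulting coefficient vectors are then clustered with an expectation–maximization fit of a Gaussian mixture with diagonal covariances. The function names, the polynomial degree, the farthest-point initialization, and the diagonal covariance structure are all illustrative assumptions; the published method estimates the mixture model jointly rather than in two steps.

```python
import numpy as np
from numpy.polynomial import legendre


def legendre_design(phenotype, degree):
    """Legendre basis evaluated on the phenotype rescaled to [-1, 1].

    Assumes the phenotype values are not all identical.
    """
    z = np.asarray(phenotype, dtype=float)
    z = 2.0 * (z - z.min()) / (z.max() - z.min()) - 1.0
    return legendre.legvander(z, degree)  # shape (n_samples, degree + 1)


def gene_coefficients(expr, phenotype, degree=2):
    """Least-squares polynomial coefficients for each gene (rows of expr)."""
    X = legendre_design(phenotype, degree)
    coef, *_ = np.linalg.lstsq(X, np.asarray(expr, dtype=float).T, rcond=None)
    return coef.T  # shape (n_genes, degree + 1)


def farthest_point_init(B, k):
    """Deterministic start: first gene, then repeatedly the farthest gene."""
    idx = [0]
    for _ in range(k - 1):
        d2 = ((B[:, None, :] - B[idx][None, :, :]) ** 2).sum(axis=2)
        idx.append(int(d2.min(axis=1).argmax()))
    return B[idx].copy()


def em_gaussian_mixture(B, k, n_iter=200):
    """EM for a k-component Gaussian mixture with diagonal covariances.

    B holds one coefficient vector per gene. Returns hard cluster labels
    and the estimated cluster means (the mean polynomial of each cluster).
    """
    n, _ = B.shape
    means = farthest_point_init(B, k)
    variances = np.tile(B.var(axis=0) + 1e-6, (k, 1))
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: posterior probability of each cluster for each gene.
        diff = B[:, None, :] - means[None, :, :]
        log_p = (np.log(weights)
                 - 0.5 * np.log(2.0 * np.pi * variances).sum(axis=1)
                 - 0.5 * (diff ** 2 / variances).sum(axis=2))
        log_p -= log_p.max(axis=1, keepdims=True)  # numerical stability
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update mixing weights, means, and variances.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / n
        means = (resp.T @ B) / nk[:, None]
        diff = B[:, None, :] - means[None, :, :]
        variances = np.einsum("nk,nkd->kd", resp, diff ** 2) / nk[:, None] + 1e-6
    return resp.argmax(axis=1), means
```

Given an expression matrix `expr` (genes in rows, samples in columns) and a phenotype vector `phen`, calling `em_gaussian_mixture(gene_coefficients(expr, phen, degree=2), k=2)` returns one label per gene together with the cluster-mean coefficients. A cluster whose mean has a non-negligible second-order coefficient then suggests a nonlinear association with the phenotype. Replacing the phenotype with sampling times gives the time-series variant mentioned above.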