Applications of beta-mixture models in bioinformatics
Open Access
- 15 February 2005
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 21 (9), 2118-2122
- https://doi.org/10.1093/bioinformatics/bti318
Abstract
Summary: We propose a beta-mixture model approach to solve a variety of problems related to correlations of gene-expression levels. For example, in meta-analyses of microarray gene-expression datasets, a threshold value of correlation coefficients for gene-expression levels is used to decide whether gene-expression levels are strongly correlated across studies. Ad hoc threshold values such as 0.5 are often used. In this paper, we use a beta-mixture model approach to divide the correlation coefficients into several populations so that the large correlation coefficients can be identified. Another important application of the proposed method is in finding co-expressed genes. Two examples are provided to illustrate both applications. Through our analysis, we also discover that the popular model selection criteria BIC and AIC are not suitable for the beta-mixture model. To determine the number of components in the mixture model, we suggest an alternative criterion, ICL–BIC, which is shown to perform better in selecting the correct mixture model.
Contact: yuanji@mdanderson.org
Supplementary information: http://odin.mdacc.tmc.edu/~yuanj/highcorgeneanno.html
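The abstract does not spell out the fitting procedure, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes correlation coefficients r in [-1, 1] are mapped to (0, 1) via (r + 1) / 2, fits a k-component beta mixture with an EM-style loop, and reports AIC, BIC and ICL-BIC (BIC plus twice the classification entropy). For brevity the M-step uses weighted moment matching in place of maximum-likelihood updates, so the criteria are approximate. All function names (`to_unit_interval`, `fit_beta_mixture`, `criteria`) and the simulated data are hypothetical.

```python
import math
import random

def to_unit_interval(r, eps=1e-6):
    # Assumed transform: map a correlation in [-1, 1] into the open interval (0, 1).
    return min(max((r + 1.0) / 2.0, eps), 1.0 - eps)

def beta_logpdf(x, a, b):
    # Log density of Beta(a, b) at x, using the log-gamma function.
    return (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
            + (a - 1.0) * math.log(x) + (b - 1.0) * math.log(1.0 - x))

def fit_beta_mixture(x, k, n_iter=200):
    n = len(x)
    # Initialise responsibilities by splitting the sorted data into k groups.
    order = sorted(range(n), key=lambda i: x[i])
    resp = [[0.0] * k for _ in range(n)]
    for rank, i in enumerate(order):
        resp[i][min(k - 1, rank * k // n)] = 1.0
    for _ in range(n_iter):
        # M-step (simplified): weighted method of moments per component.
        weights, params = [], []
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-8)
            mean = sum(r[j] * xi for r, xi in zip(resp, x)) / nj
            mean = min(max(mean, 1e-6), 1.0 - 1e-6)
            var = sum(r[j] * (xi - mean) ** 2 for r, xi in zip(resp, x)) / nj
            # Keep the variance inside the range a beta distribution allows.
            var = min(max(var, 1e-6), mean * (1.0 - mean) * 0.999)
            common = mean * (1.0 - mean) / var - 1.0
            params.append((mean * common, (1.0 - mean) * common))
            weights.append(nj / n)
        # E-step: posterior membership probabilities and log-likelihood.
        loglik = 0.0
        for i, xi in enumerate(x):
            logs = [math.log(w) + beta_logpdf(xi, a, b)
                    for w, (a, b) in zip(weights, params)]
            m = max(logs)
            lse = m + math.log(sum(math.exp(v - m) for v in logs))
            resp[i] = [math.exp(v - lse) for v in logs]
            loglik += lse
    return weights, params, resp, loglik

def criteria(k, n, resp, loglik):
    # Free parameters: two shape parameters per component plus k - 1 weights.
    n_params = 3 * k - 1
    aic = -2.0 * loglik + 2.0 * n_params
    bic = -2.0 * loglik + n_params * math.log(n)
    entropy = -sum(r * math.log(r) for row in resp for r in row if r > 0.0)
    return {"AIC": aic, "BIC": bic, "ICL-BIC": bic + 2.0 * entropy}

if __name__ == "__main__":
    rng = random.Random(1)
    # Simulated correlations: a bulk near zero plus a smaller high-correlation group.
    r_values = ([2.0 * rng.betavariate(8, 8) - 1.0 for _ in range(300)]
                + [2.0 * rng.betavariate(30, 4) - 1.0 for _ in range(100)])
    x = [to_unit_interval(r) for r in r_values]
    for k in (1, 2, 3):
        weights, params, resp, loglik = fit_beta_mixture(x, k)
        print(k, criteria(k, len(x), resp, loglik))
```

In this convention smaller values are preferred. The entropy term is zero when every observation is assigned to a single component with certainty and grows as components overlap, which is why ICL-BIC tends to penalise extra, poorly separated components more heavily than BIC alone. Once a model is chosen, the responsibilities in `resp` could be used to flag the correlations that belong to the component with the largest mean.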