Variable Selection in Penalized Model‐Based Clustering Via Regularization on Grouped Parameters
- 20 December 2007
- journal article
- Published by Oxford University Press (OUP) in Biometrics
- Vol. 64 (3) , 921-930
- https://doi.org/10.1111/j.1541-0420.2007.00955.x
Abstract
Penalized model‐based clustering has been proposed for high‐dimensional but small sample‐sized data, such as arising from genomic studies; in particular, it can be used for variable selection. A new regularization scheme is proposed to group together multiple parameters of the same variable across clusters, which is shown both analytically and numerically to be more effective than the conventional L1 penalty for variable selection. In addition, we develop a strategy to combine this grouping scheme with grouping structured variables. Simulation studies and applications to microarray gene expression data for cancer subtype discovery demonstrate the advantage of the new proposal over several existing approaches.
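To make the contrast in the abstract concrete, the following is a minimal sketch, not the paper's actual estimation procedure. It assumes a matrix of cluster means with one row per cluster and one column per standardized variable, and compares elementwise soft-thresholding (the usual effect of an L1 penalty) with a group-wise shrinkage that treats all cluster means of one variable as a single block. The function names, the toy matrix `mu`, and the value of `lam` are all hypothetical, and the cluster-size and variance factors that would appear in a real penalized EM update are left out.

```python
import numpy as np

def soft_threshold(mu, lam):
    # Elementwise shrinkage: each cluster mean is thresholded on its own.
    return np.sign(mu) * np.maximum(np.abs(mu) - lam, 0.0)

def group_soft_threshold(mu, lam):
    # Group shrinkage: each column (one variable across all clusters)
    # is rescaled by a common factor, so it is zeroed as a whole or not at all.
    norms = np.linalg.norm(mu, axis=0, keepdims=True)
    scale = np.maximum(1.0 - lam / np.maximum(norms, 1e-12), 0.0)
    return mu * scale

# Hypothetical cluster means: 2 clusters (rows) x 3 variables (columns).
mu = np.array([[ 1.0, 0.2,  0.05],
               [-1.0, 0.0, -0.05]])
lam = 0.3  # illustrative penalty level, not a recommended value

l1 = soft_threshold(mu, lam)
grp = group_soft_threshold(mu, lam)

# A variable counts as selected if any of its cluster means is nonzero.
selected_l1 = np.any(l1 != 0, axis=0)
selected_group = np.any(grp != 0, axis=0)
print(selected_l1, selected_group)
```

Under the elementwise rule a variable drops out only when every one of its cluster means happens to be thresholded to zero separately, whereas the group rule makes that decision once per variable, based on the joint size of its means.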
This publication has 26 references indexed in Scilit:
- Penalized model-based clustering with cluster-specific diagonal covariance matrices and grouped variables. Electronic Journal of Statistics, 2008
- On the “degrees of freedom” of the lasso. The Annals of Statistics, 2007
- Improved centroids estimation for the nearest shrunken centroid classifier. Bioinformatics, 2007
- Semi-supervised learning via penalized mixture model with application to microarray sample classification. Bioinformatics, 2006
- Variable Selection for Model-Based Clustering. Journal of the American Statistical Association, 2006
- Bayesian Variable Selection in Clustering High-Dimensional Data. Journal of the American Statistical Association, 2005
- Comparison of Discrimination Methods for the Classification of Tumors Using Gene Expression Data. Journal of the American Statistical Association, 2002
- KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Research, 2000
- On the Convergence Properties of the EM Algorithm. The Annals of Statistics, 1983
- Objective Criteria for the Evaluation of Clustering Methods. Journal of the American Statistical Association, 1971