Incorporating Predictor Network in Penalized Regression with Application to Microarray Data
- 1 June 2010
- journal article
- Published by Oxford University Press (OUP) in Biometrics
- Vol. 66 (2) , 474-484
- https://doi.org/10.1111/j.1541-0420.2009.01296.x
Abstract
Summary. We consider penalized linear regression, especially for “large p, small n” problems, for which the relationships among predictors are described a priori by a network. A class of motivating examples includes modeling a phenotype through gene expression profiles while accounting for the coordinated functioning of genes in the form of biological pathways or networks. To incorporate the prior knowledge that neighboring predictors in a network have similar effect sizes, we propose a grouped penalty based on the Lγ‐norm that smooths the regression coefficients of the predictors over the network. The main feature of the proposed method is its ability to automatically realize grouped variable selection and exploit grouping effects. We also discuss the effects of the choice of γ and of some weights inside the Lγ‐norm. Simulation studies demonstrate the superior finite‐sample performance of the proposed method as compared to Lasso, elastic net, and a recently proposed network‐based method. The new method performs best in variable selection across all simulation set‐ups considered. For illustration, the method is applied to a microarray dataset to predict survival times for some glioblastoma patients using a gene expression dataset and a gene network compiled from some Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways.
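To make the idea of a grouped Lγ‐norm penalty over a network concrete, the following is a minimal sketch and not the authors' implementation. It assumes the penalty is a sum over network edges of the Lγ‐norm of the two (weighted) coefficients joined by each edge. The function name `network_penalty`, the way the weights enter, and the toy network are all assumptions made for illustration; the abstract does not give the exact formula.

```python
import numpy as np

def network_penalty(beta, edges, gamma=2.0, weights=None, lam=1.0):
    """Illustrative grouped L_gamma penalty summed over network edges.

    Assumed form (not taken from the abstract):
        lam * sum over edges (i, j) of
            ((|beta_i| / w_i)**gamma + (|beta_j| / w_j)**gamma)**(1 / gamma)
    """
    beta = np.asarray(beta, dtype=float)
    if weights is None:
        # Hypothetical default: every predictor gets weight 1.
        weights = np.ones_like(beta)
    weights = np.asarray(weights, dtype=float)

    total = 0.0
    for i, j in edges:
        a = abs(beta[i]) / weights[i]
        b = abs(beta[j]) / weights[j]
        if np.isinf(gamma):
            # L_infinity norm of the pair: the larger magnitude.
            total += max(a, b)
        else:
            total += (a ** gamma + b ** gamma) ** (1.0 / gamma)
    return lam * total

# Toy chain network 0 - 1 - 2 with made-up coefficients.
beta = [3.0, 4.0, 0.0]
edges = [(0, 1), (1, 2)]
for g in (1.0, 2.0, np.inf):
    print(g, network_penalty(beta, edges, gamma=g))
```

Under this assumed form, γ = 1 reduces each edge term to a sum of absolute values (a Lasso-like term), while γ > 1 couples the two coefficients on an edge, so an edge term is zero only when both coefficients are zero. That coupling is one way to read the grouped variable selection described in the abstract.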
This publication has 27 references indexed in Scilit:
- Network-based multiple locus linkage analysis of expression traits. Bioinformatics, 2009
- Network-based support vector machine for classification of microarray samples. BMC Bioinformatics, 2009
- Simultaneous Regression Shrinkage, Variable Selection, and Supervised Clustering of Predictors with OSCAR. Biometrics, 2008
- Analysis of oncogenic signaling networks in glioblastoma identifies ASPM as a molecular target. Proceedings of the National Academy of Sciences, 2006
- Regularization and Variable Selection Via the Elastic Net. Journal of the Royal Statistical Society Series B: Statistical Methodology, 2005
- Classification of gene microarrays by penalized logistic regression. Biostatistics, 2004
- Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics, 2003
- KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Research, 2000
- Ridge Regression: Biased Estimation for Nonorthogonal Problems. Technometrics, 1970