Regression methods for high dimensional multicollinear data
- 1 January 2000
- journal article
- research article
- Published by Taylor & Francis in Communications in Statistics - Simulation and Computation
- Vol. 29 (4), 1021-1037
- https://doi.org/10.1080/03610910008813652
Abstract
To compare their performance on high dimensional data, several regression methods are applied to data sets in which the number of explanatory variables greatly exceeds the sample size. The methods are stepwise regression, principal components regression, two forms of latent root regression, partial least squares, and a new method developed here. The data are four sample sets for which near-infrared reflectance spectra have been determined, and the regression methods use the spectra to estimate the concentrations of various chemical constituents, the latter having been determined by standard chemical analysis. Thirty-two regression equations are estimated using each method, and their performance is evaluated using validation data sets. Although it is the most widely used, stepwise regression was decidedly poorer than the other methods considered. Differences among the latter were small, with partial least squares performing slightly better than the other methods under all criteria examined, albeit not by a statistically significant amount.
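The kind of comparison described in the abstract can be illustrated with a minimal sketch, under several assumptions that are not taken from the article: the data are synthetic "spectra" generated from three latent factors rather than the near-infrared sets, only principal components regression and partial least squares are shown, and the function names (`simulate`, `fit_pcr`, `fit_pls1`, `rmsep`) are hypothetical. The PLS fit uses the standard NIPALS algorithm for a single response, which may differ from the variant used in the paper, and the number of components is fixed at three rather than chosen by cross-validation.

```python
import numpy as np

def simulate(n_train=40, n_valid=40, p=300, noise=0.05, seed=0):
    """Synthetic collinear data with p >> n (an assumption, not the paper's data)."""
    rng = np.random.default_rng(seed)
    n = n_train + n_valid
    scores = rng.normal(size=(n, 3))          # three latent factors
    loadings = rng.normal(size=(3, p))        # shared "spectral" loadings
    X = scores @ loadings + noise * rng.normal(size=(n, p))
    y = scores @ np.array([1.0, -0.5, 0.25]) + noise * rng.normal(size=n)
    return X[:n_train], y[:n_train], X[n_train:], y[n_train:]

def fit_pcr(X, y, k):
    """Principal components regression on the first k components."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:k].T                              # p x k component directions
    T = Xc @ V                                # n x k scores, T'T = diag(s^2)
    gamma = (T.T @ yc) / s[:k] ** 2           # least squares on the scores
    coef = V @ gamma
    return coef, y_mean - x_mean @ coef

def fit_pls1(X, y, k):
    """Partial least squares for one response via NIPALS with k components."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(k):
        w = Xk.T @ yk                         # weight: covariance direction
        w = w / np.linalg.norm(w)
        t = Xk @ w                            # score vector
        tt = t @ t
        p_load = Xk.T @ t / tt                # X loading
        c = yk @ t / tt                       # y loading
        Xk = Xk - np.outer(t, p_load)         # deflate X
        yk = yk - c * t                       # deflate y
        W.append(w); P.append(p_load); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)    # B = W (P'W)^{-1} q
    return coef, y_mean - x_mean @ coef

def rmsep(coef, intercept, X, y):
    """Root mean squared error of prediction on a validation set."""
    return float(np.sqrt(np.mean((X @ coef + intercept - y) ** 2)))

X_train, y_train, X_valid, y_valid = simulate()
for name, fit in (("PCR", fit_pcr), ("PLS", fit_pls1)):
    coef, intercept = fit(X_train, y_train, 3)
    print(name, "validation RMSEP:", rmsep(coef, intercept, X_valid, y_valid))
```

Both fits are trained on one half of the simulated samples and scored on the other half, which mirrors the use of separate validation sets in the abstract. On real spectra the number of components would normally be selected by cross-validation rather than fixed in advance.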