Some Cautionary Notes on the Use of Principal Components Regression
- 1 February 1998
- journal article
- research article
- Published by Taylor & Francis in The American Statistician
- Vol. 52 (1), 15-19
- https://doi.org/10.1080/00031305.1998.10480530
Abstract
Many textbooks on regression analysis include the methodology of principal components regression (PCR) as a way of treating multicollinearity problems. Although we have not encountered any strong justification of the methodology, we have encountered, through carrying out the methodology in well-known data sets with severe multicollinearity, serious actual and potential pitfalls in the methodology. We address these pitfalls as cautionary notes, numerical examples that use well-known data sets. We also illustrate by theory and example that it is possible for the PCR to fail miserably in the sense that when the response variable is regressed on all of the p principal components (PCs), the first (p − 1) PCs contribute nothing toward the reduction of the residual sum of squares, yet the last PC alone (the one that is always discarded according to PCR methodology) contributes everything. We then give conditions under which the PCR totally fails in the above sense.
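The failure mode described in the abstract can be reproduced on simulated data. The following is a minimal sketch, not taken from the article: the data are synthetic, the function name `pc_contributions` is hypothetical, and the response is deliberately constructed to equal the scores of the last principal component. Because PC scores are mutually orthogonal, the regression sum of squares splits into one term per PC, so each PC's share of the reduction in the residual sum of squares can be read off directly.

```python
import numpy as np


def pc_contributions(X, y):
    """Share of the regression sum of squares attributable to each PC.

    PCs are ordered by decreasing variance, so the last entry belongs to
    the component that PCR would discard first.
    """
    Xc = X - X.mean(axis=0)          # center the predictors
    yc = y - y.mean()                # center the response
    # Left singular vectors are the (normalized) PC scores.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    # Orthogonal scores: PC j reduces the RSS by (u_j' y)^2.
    contrib = (U.T @ yc) ** 2
    return contrib / contrib.sum()


rng = np.random.default_rng(0)
n = 200

# Three predictors that share a common factor z, hence severe multicollinearity
# (an assumed toy design, chosen only for illustration).
z = rng.normal(size=n)
X = np.column_stack([z + 0.05 * rng.normal(size=n) for _ in range(3)])

# Build a response that lies entirely along the last (smallest-variance) PC.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
y = 10.0 + Xc @ Vt[-1]

shares = pc_contributions(X, y)
print(np.round(shares, 6))
```

In this constructed case the first two PCs account for essentially none of the fit and the last PC accounts for all of it, so a PCR that drops the smallest-variance component would discard the only useful predictor. Real data sets will rarely be this extreme; the sketch only shows that the variance of a PC says nothing by itself about its relevance to the response.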