Some Cautionary Notes on the Use of Principal Components Regression

Abstract
Many textbooks on regression analysis include the methodology of principal components regression (PCR) as a way of treating multicollinearity problems. Although we have not encountered any strong justification for the methodology, applying it to well-known data sets with severe multicollinearity has revealed serious actual and potential pitfalls. We address these pitfalls as cautionary notes, illustrated with numerical examples that use well-known data sets. We also show, by theory and example, that PCR can fail miserably in the following sense: when the response variable is regressed on all p principal components (PCs), the first (p − 1) PCs contribute nothing toward the reduction of the residual sum of squares, yet the last PC alone (the one that is always discarded according to PCR methodology) contributes everything. We then give conditions under which PCR fails totally in this sense.
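The failure mode described above can be reproduced on synthetic data. The following is a minimal sketch, not the authors' own example: the data, the variable names, and the choice to set the response exactly equal to the last PC are all assumptions made for illustration. Because the PC scores are mutually orthogonal, the regression sum of squares splits cleanly into one term per component, so each PC's contribution can be read off directly.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 4

# Hypothetical predictors with severe multicollinearity:
# the fourth column is almost exactly the sum of the first three.
Z = rng.normal(size=(n, p - 1))
X = np.column_stack([Z, Z.sum(axis=1) + 1e-2 * rng.normal(size=n)])
X = (X - X.mean(axis=0)) / X.std(axis=0)  # center and scale each column

# Principal components via the SVD X = U S V^T.
# The PC scores are the columns of U * S, ordered by decreasing variance.
U, S, Vt = np.linalg.svd(X, full_matrices=False)
scores = U * S

# Share of the total predictor variance carried by each PC.
var_share = S**2 / np.sum(S**2)

# Assumed extreme case: the response is exactly the last (smallest-variance) PC.
y = scores[:, -1].copy()

# Regress y on all p PCs. Orthogonality of the scores means each coefficient
# is a simple projection, and the regression SS decomposes per component.
coef = scores.T @ y / S**2
ss_per_pc = coef**2 * S**2
ss_share = ss_per_pc / (y @ y)

print("variance share of each PC:      ", np.round(var_share, 6))
print("share of regression SS per PC:  ", np.round(ss_share, 6))
```

In this constructed case the last PC carries a negligible share of the predictor variance, so a variance-based PCR rule would discard it, yet it accounts for essentially all of the reduction in the residual sum of squares.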
