Robust Principal Components
- 1 January 1989
- journal article
- research article
- Published by Taylor & Francis in Communications in Statistics - Simulation and Computation
- Vol. 18 (3) , 857-874
- https://doi.org/10.1080/03610918908812795
Abstract
This paper proposes a new algorithm to obtain an eigenvalue decomposition for the sample covariance matrix of a multivariate dataset. The algorithm is based on the rotation technique employed by Ammann and Van Ness (1988a,b) to obtain a robust solution to an errors-in-variables problem. When this rotation technique is combined with an iterative reweighting of the data, a robust eigenvalue decomposition is obtained. This robust eigenvalue decomposition has important applications to principal component analysis. Monte Carlo simulations are performed to compare ordinary principal component analysis using the standard eigenvalue decomposition with this algorithm, referred to as ROPRC. It is seen that ROPRC is reasonably efficient compared to an eigenvalue decomposition when Gaussian data is available, and that ROPRC is much better than the eigenvalue decomposition if outliers are present or if the data has a heavy-tailed distribution. The algorithm returns useful numerical diagnostic information in the form of a matrix of weights that describes the importance of each observation in the determination of each of the principal components. These weights are used to obtain robust estimates of the eigenvalues and the underlying covariance structure of the data. An example is given to illustrate the use of ROPRC and to compare its results with standard principal component analysis.
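The abstract does not give the details of ROPRC, so the following is only a rough sketch of the general idea of combining iterative reweighting with an eigenvalue decomposition. It is not the paper's algorithm: it omits the rotation technique, and it produces a single weight per observation rather than the matrix of per-component weights described above. The function names, the Huber-type weight min(1, c²/d²) on squared Mahalanobis distances, and the cutoff `c=2.5` are all assumptions chosen for illustration.

```python
import numpy as np


def weighted_mean_cov(X, w):
    """Weighted mean and covariance for observation weights w (hypothetical helper)."""
    mu = np.average(X, axis=0, weights=w)
    Xc = X - mu
    cov = (Xc * w[:, None]).T @ Xc / w.sum()
    return mu, cov


def reweighted_pca(X, c=2.5, n_iter=100, tol=1e-8):
    """Illustrative iteratively reweighted PCA (an assumption, NOT ROPRC).

    Returns eigenvalues (descending), eigenvectors (as columns), and the
    final per-observation weights.
    """
    X = np.asarray(X, dtype=float)
    w = np.ones(X.shape[0])
    for _ in range(n_iter):
        mu, cov = weighted_mean_cov(X, w)
        Xc = X - mu
        # Squared Mahalanobis distance of each row under the current estimate.
        d2 = np.sum(Xc * np.linalg.solve(cov, Xc.T).T, axis=1)
        # Huber-type weights: 1 inside the cutoff, (c/d)^2 outside it.
        w_new = np.minimum(1.0, c**2 / np.maximum(d2, 1e-12))
        converged = np.max(np.abs(w_new - w)) < tol
        w = w_new
        if converged:
            break
    # Eigendecomposition of the covariance implied by the final weights.
    _, cov = weighted_mean_cov(X, w)
    evals, evecs = np.linalg.eigh(cov)
    order = np.argsort(evals)[::-1]
    return evals[order], evecs[:, order], w


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = rng.normal(size=(200, 2)) * np.array([3.0, 1.0])
    outliers = rng.normal(size=(10, 2)) * 0.5 + np.array([0.0, 30.0])
    X = np.vstack([clean, outliers])
    evals, evecs, w = reweighted_pca(X)
    print("first robust component:", evecs[:, 0])
    print("largest outlier weight:", w[200:].max())
```

In this toy setup the small weights serve as a diagnostic, flagging the observations that had little influence on the fitted components. This is similar in spirit to, but much simpler than, the weight matrix the abstract describes.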
This publication has 4 references indexed in Scilit:
- A routine for converting regression algorithms into corresponding orthogonal regression algorithms. ACM Transactions on Mathematical Software, 1988
- Estimation of parameters in linear structural relationships: Sensitivity to the choice of the ratio of error variances. Biometrika, 1984
- The Influence Function in the Errors in Variables Problem. The Annals of Statistics, 1984
- A System of Subroutines for Iteratively Reweighted Least Squares Computations. ACM Transactions on Mathematical Software, 1980