Outlier Detection in Multivariate Analytical Chemical Data
- 1 May 1998
- journal article
- research article
- Published by American Chemical Society (ACS) in Analytical Chemistry
- Vol. 70 (11) , 2372-2379
- https://doi.org/10.1021/ac970763d
Abstract
The unreliability of multivariate outlier detection techniques such as Mahalanobis distance and hat matrix leverage has been known in the statistical community for well over a decade. However, only within the past few years has a serious effort been made to introduce robust methods for the detection of multivariate outliers into the chemical literature. Techniques such as the minimum volume ellipsoid (MVE), multivariate trimming (MVT), and M-estimators (e.g., PROP), and others similar to them, such as the minimum covariance determinant (MCD), rely upon algorithms that are difficult to program and may require significant processing times. While MCD and MVE have been shown to be statistically sound, we found MVT unreliable due to the method's use of the Mahalanobis distance measure in its initial step. We examined the performance of MCD and MVT on selected data sets and in simulations and compared the results with two methods of our own devising. Both the proposed resampling by the half-means method and the smallest half-volume method are simple to use, are conceptually clear, and provide results superior to MVT and the current best-performing technique, MCD. Either proposed method is recommended for the detection of multiple outliers in multivariate data.
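The masking problem that makes the classical Mahalanobis distance unreliable can be illustrated with a small simulation. The sketch below is not taken from the article and does not implement any of the methods it compares; the data, the cluster location, the sample sizes, and the function name `mahalanobis` are all assumptions chosen for illustration. It computes distances once with the mean and covariance estimated from the full contaminated sample, and once with estimates from the points known (only because the data are simulated) to be clean.

```python
import numpy as np


def mahalanobis(X, center, cov):
    """Mahalanobis distance of each row of X from `center` under `cov`."""
    diff = np.asarray(X, dtype=float) - center
    # Solve cov @ z = diff.T rather than forming the explicit inverse.
    solved = np.linalg.solve(cov, diff.T).T
    return np.sqrt(np.sum(diff * solved, axis=1))


# Hypothetical data: 40 clean points plus a tight cluster of 10 outliers.
rng = np.random.default_rng(0)
clean = rng.normal(size=(40, 2))
outliers = 6.0 + rng.normal(scale=0.1, size=(10, 2))
X = np.vstack([clean, outliers])

# Classical estimates use every point, so the outliers shift the mean
# and inflate the covariance that is then used to judge them.
d_classical = mahalanobis(X, X.mean(axis=0), np.cov(X, rowvar=False))

# Reference estimates use only the clean points. This is possible here
# only because the simulation tells us which points are clean.
d_reference = mahalanobis(X, clean.mean(axis=0), np.cov(clean, rowvar=False))

print("outlier distances, classical estimates:", d_classical[40:].round(2))
print("outlier distances, clean-data estimates:", d_reference[40:].round(2))
```

Under these assumptions the clustered outliers receive much smaller distances from the contaminated estimates than from the clean-data estimates, which is the kind of behavior that motivates robust estimators of location and scatter.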