Comparison of algorithms for replacing missing data in discriminant analysis

1 January 1992

journal article
research article
Published by Taylor & Francis in Communications in Statistics - Theory and Methods

Vol. 21 (6) , 1567-1578
https://doi.org/10.1080/03610929208830864

Abstract

We examined the impact of different methods for replacing missing data in discriminant analyses conducted on randomly generated samples from multivariate normal and non-normal distributions. The probabilities of correct classification were obtained for these discriminant analyses before and after randomly deleting data as well as after deleted data were replaced using: (1) variable means, (2) principal component projections, and (3) the EM algorithm. Populations compared were: (1) multivariate normal with covariance matrices Σ₁=Σ₂, (2) multivariate normal with Σ₁≠Σ₂, and (3) multivariate non-normal with Σ₁=Σ₂. Differences in the probabilities of correct classification were most evident for populations with small Mahalanobis distances or high proportions of missing data. The three replacement methods performed similarly, but all were better than non-replacement.
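The kind of experiment the abstract describes can be illustrated with a minimal Python sketch. It is not the authors' code: the sample sizes, the dimension, the mean shift, the 20% deletion rate, and all function names are assumptions chosen for illustration, and only the simplest of the three replacement methods (variable means, here computed within each group) is shown. It covers the equal-covariance normal case, fits Fisher's linear discriminant function, and estimates the probability of correct classification on a large independent test sample.

```python
import numpy as np


def simulate_groups(rng, n, mean_shift, dim=4):
    """Draw n observations per group from N(0, I) and N(mean_shift * 1, I)."""
    x1 = rng.standard_normal((n, dim))
    x2 = rng.standard_normal((n, dim)) + mean_shift
    return x1, x2


def delete_at_random(rng, x, prop):
    """Return a copy of x with roughly a proportion `prop` of entries set to NaN."""
    x = x.copy()
    x[rng.random(x.shape) < prop] = np.nan
    return x


def replace_with_means(x):
    """Replace each NaN with the mean of the observed values in its column."""
    col_means = np.nanmean(x, axis=0)
    return np.where(np.isnan(x), col_means, x)


def fit_lda(x1, x2):
    """Fisher's linear discriminant with pooled covariance and equal priors."""
    m1, m2 = x1.mean(axis=0), x2.mean(axis=0)
    n1, n2 = len(x1), len(x2)
    pooled = ((n1 - 1) * np.cov(x1, rowvar=False)
              + (n2 - 1) * np.cov(x2, rowvar=False)) / (n1 + n2 - 2)
    w = np.linalg.solve(pooled, m1 - m2)   # discriminant direction
    c = w @ (m1 + m2) / 2                  # midpoint cut-off
    return w, c


def correct_rate(w, c, x1, x2):
    """Proportion of observations assigned to their true group."""
    hits = np.sum(x1 @ w > c) + np.sum(x2 @ w <= c)
    return hits / (len(x1) + len(x2))


rng = np.random.default_rng(0)

# Assumed setting: 4 variables, shift of 1 per variable (Mahalanobis distance 2).
train1, train2 = simulate_groups(rng, n=50, mean_shift=1.0)
test1, test2 = simulate_groups(rng, n=2000, mean_shift=1.0)

# Discriminant analysis on the complete training data.
w, c = fit_lda(train1, train2)
rate_complete = correct_rate(w, c, test1, test2)

# Delete 20% of training entries at random, then replace them with group-wise means.
imputed1 = replace_with_means(delete_at_random(rng, train1, 0.2))
imputed2 = replace_with_means(delete_at_random(rng, train2, 0.2))
w_imp, c_imp = fit_lda(imputed1, imputed2)
rate_imputed = correct_rate(w_imp, c_imp, test1, test2)

print(f"complete data: {rate_complete:.3f}")
print(f"mean-replaced: {rate_imputed:.3f}")
```

Reducing `mean_shift` or raising the deletion proportion in this sketch corresponds to the small Mahalanobis distances and high proportions of missing data for which the abstract reports the most evident differences. A single run like this is only one replicate; a comparison of methods would average over many simulated samples.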
