A Graphical Technique for Determining the Number of Components in a Mixture of Normals

1 June 1994

journal article
research article
Published by JSTOR in Journal of the American Statistical Association

Vol. 89 (426), 487
https://doi.org/10.2307/2290850

Abstract

When a population is assumed to be composed of a finite number of subpopulations, a natural model to choose is the finite mixture model. It will often be the case, however, that the number of component distributions is unknown and must be estimated. This problem can be difficult; for instance, the density of two mixed normals is not bimodal unless the means are separated by at least 2 standard deviations. Hence modality of the data per se can be an insensitive approach to component estimation. We demonstrate that a mixture of two normals divided by a normal density having the same mean and variance as the mixed density is always bimodal. This analytic result and other related results form the basis for a diagnostic and a test for the number of components in a mixture of normals. The density is estimated using a kernel density estimator. Under the null hypothesis, the proposed diagnostic can be approximated by a stationary Gaussian process. Under the alternative hypothesis, components in the mixture will express themselves as major modes in the diagnostic plot. A test for mixing is based on the amount of smoothing necessary to suppress these large deviations from a Gaussian process.
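The bimodality claim in the abstract can be checked numerically. The following is a minimal sketch, not the authors' procedure: it uses the true densities rather than a kernel density estimate, and the mixture parameters (equal weights, unit standard deviations, means 1.5 standard deviations apart) are illustrative assumptions chosen so that the mixture density itself has a single mode. The helper names `normal_pdf` and `count_modes` are hypothetical.

```python
import numpy as np

def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def count_modes(y):
    """Count strict interior local maxima of a curve sampled on a grid."""
    interior = y[1:-1]
    return int(np.sum((interior > y[:-2]) & (interior > y[2:])))

# Illustrative parameters (assumptions, not taken from the paper):
# equal weights, common sd of 1, means separated by 1.5 sd (less than 2 sd).
weight, mu1, mu2, sigma = 0.5, -0.75, 0.75, 1.0

x = np.linspace(-6.0, 6.0, 2401)
mixture = weight * normal_pdf(x, mu1, sigma) + (1 - weight) * normal_pdf(x, mu2, sigma)

# Normal density with the same mean and variance as the mixture.
mix_mean = weight * mu1 + (1 - weight) * mu2
mix_var = sigma**2 + weight * (1 - weight) * (mu1 - mu2) ** 2
reference = normal_pdf(x, mix_mean, np.sqrt(mix_var))

# Mixture density divided by the moment-matched normal density.
ratio = mixture / reference

print("modes in mixture density:", count_modes(mixture))
print("modes in ratio:", count_modes(ratio))
```

With these assumed parameters the mixture density should show one mode while the ratio shows two, which is the behavior the abstract describes. In practice the mixture density is unknown and would be replaced by an estimate, so sampling noise would also appear in the plot.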
