Testing Homogeneity in a Mixture Distribution via the L² Distance Between Competing Models

1 June 2004

journal article
Published by Taylor & Francis in Journal of the American Statistical Association

Vol. 99 (466), 488-498
https://doi.org/10.1198/016214504000000494

Abstract

Ascertaining the number of components in a mixture distribution is an interesting and challenging problem for statisticians. Chen, Chen, and Kalbfleisch recently proposed a modified likelihood ratio test (MLRT), which is distribution-free and locally most powerful, asymptotically. In this article we present a new method for testing whether a finite mixture distribution is homogeneous. Our method, the D test, is based on the L² distance between a fitted homogeneous model and a fitted heterogeneous model. For mixture components from standard parametric families, the D-test statistic has a closed-form expression in terms of parameter estimators, whereas likelihood ratio-type test statistics do not; the latter test statistics are nontrivial functions of both the parameter estimators and the full dataset. The convergence rates of the D-test statistic under a null hypothesis of homogeneity and an alternative hypothesis of heterogeneity are established. The D test is shown to be competitive with the MLRT when the mixture components come from a normal location family. However, in the exponential scale and normal location/scale cases, the relative performances of the D test and the MLRT are mixed. In cases such as these two, we propose to use a weighted D test, in which the measure underlying the L² distance is changed to accentuate the disparities between the homogeneous and heterogeneous models. Changing the measure is equivalent to computing the D-test statistic using a weighting function or to transforming the data before conducting the D test. Appropriately weighted D tests are competitive in both the exponential scale and normal location/scale cases. After applying the D test to a dataset in which the observations are measurements of firms' financial performances, we conclude with discussion and remarks.
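The closed-form property mentioned in the abstract can be illustrated for normal components. The sketch below is not the authors' D-test statistic (whose exact scaling and estimation procedure are not given here); it only assumes the standard identity that the integral of the product of two normal densities is itself a normal density evaluated at the difference of the means, with the variances added. The function names (`normal_pdf`, `gauss_overlap`, `l2_distance_sq`) are hypothetical, and in practice the arguments would be parameter estimates from the fitted homogeneous and heterogeneous models.

```python
import math


def normal_pdf(x, mean, var):
    """Density of N(mean, var) evaluated at x."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def gauss_overlap(m1, v1, m2, v2):
    """Integral over the real line of N(x; m1, v1) * N(x; m2, v2).

    Uses the identity: the integral equals the N(0, v1 + v2) density at m1 - m2.
    """
    return normal_pdf(m1 - m2, 0.0, v1 + v2)


def l2_distance_sq(mu0, var0, weights, means, variances):
    """Squared L2 distance between N(mu0, var0) and a finite normal mixture.

    Expands the integral of (f0 - f1)^2 as
    integral(f0^2) - 2 * integral(f0 * f1) + integral(f1^2),
    where each term is a finite sum of closed-form Gaussian overlaps.
    Only parameter values are needed, not the raw data.
    """
    comps = list(zip(weights, means, variances))
    hom = gauss_overlap(mu0, var0, mu0, var0)
    cross = sum(w * gauss_overlap(mu0, var0, m, v) for w, m, v in comps)
    het = sum(
        wi * wj * gauss_overlap(mi, vi, mj, vj)
        for wi, mi, vi in comps
        for wj, mj, vj in comps
    )
    return hom - 2.0 * cross + het


if __name__ == "__main__":
    # Hypothetical fitted values: one normal versus a two-component location mixture.
    d2 = l2_distance_sq(0.0, 1.5, [0.4, 0.6], [-1.0, 0.7], [1.0, 1.0])
    print(f"squared L2 distance: {d2:.6f}")
```

Because every term depends only on the fitted parameters, the distance can be computed without revisiting the dataset, which is the contrast with likelihood ratio-type statistics drawn in the abstract.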
