A cautionary note about assessing the fit of logistic regression models

Abstract
Logistic regression is a popular method of relating a binary response to one or more potential covariables or risk factors. In 1980, Hosmer and Lemeshow proposed a method for assessing the goodness of fit of logistic regression models. This test is based on a chi-squared statistic that compares the observed and expected cell frequencies in the 2 × g table, as found by sorting the observations by predicted probabilities and forming g groups. We have noted that the test may be sensitive to situations where there are low expected cell frequencies. Further, several commonly used statistical packages apply the Hosmer-Lemeshow test, but do so in different ways, and none of the packages we considered alerted the user to the potential difficulty with low expected cell frequencies. An alternative goodness-of-fit test is illustrated which seems to offer an advantage over the popular Hosmer-Lemeshow test, by reducing the likelihood of small expected counts and, potentially, sharpening the interpretation. An example is provided which demonstrates these ideas.
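
To make the construction of the 2 × g table concrete, the following is a minimal Python sketch of the Hosmer-Lemeshow statistic as described above, assuming the fitted probabilities are already available. The function names `hosmer_lemeshow` and `low_expected_cells` are hypothetical, as is the rank-based split into near-equal groups (tied probabilities are not treated specially, and packages may group differently). The threshold of 5 for flagging a small expected count is a conventional rule of thumb assumed here for illustration; it is not taken from the paper.

```python
from typing import List, Sequence, Tuple

# One row of the 2 x g table:
# (observed events, expected events, observed non-events, expected non-events)
Cell = Tuple[float, float, float, float]


def hosmer_lemeshow(
    y: Sequence[int], p: Sequence[float], g: int = 10
) -> Tuple[float, int, List[Cell]]:
    """Return (chi-squared statistic, degrees of freedom, table cells).

    y holds 0/1 responses; p holds fitted probabilities strictly
    between 0 and 1, as produced by a logistic regression model.
    """
    n = len(y)
    if n != len(p):
        raise ValueError("y and p must have the same length")
    if not 3 <= g <= n:
        raise ValueError("g must satisfy 3 <= g <= number of observations")

    # Sort observation indices by predicted probability.
    order = sorted(range(n), key=lambda i: p[i])

    stat = 0.0
    cells: List[Cell] = []
    for k in range(g):
        # Assumed grouping: g consecutive blocks of near-equal size.
        idx = order[k * n // g:(k + 1) * n // g]
        obs1 = float(sum(y[i] for i in idx))   # observed events
        exp1 = float(sum(p[i] for i in idx))   # expected events
        obs0 = len(idx) - obs1                 # observed non-events
        exp0 = len(idx) - exp1                 # expected non-events
        stat += (obs1 - exp1) ** 2 / exp1 + (obs0 - exp0) ** 2 / exp0
        cells.append((obs1, exp1, obs0, exp0))

    # The usual reference distribution is chi-squared with g - 2 df.
    return stat, g - 2, cells


def low_expected_cells(cells: Sequence[Cell], threshold: float = 5.0) -> List[int]:
    """Return indices of groups whose smaller expected count is below threshold."""
    return [k for k, (_, e1, _, e0) in enumerate(cells) if min(e1, e0) < threshold]
```

Calling `low_expected_cells` on the cells returned by `hosmer_lemeshow` gives the kind of warning about small expected counts that, according to the abstract, none of the packages considered provided. Obtaining a p-value would additionally require the chi-squared survival function, which is left out of this sketch.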
