Optimizing the Classification Performance of Logistic Regression and Fisher's Discriminant Analyses

Abstract
Logistic regression analysis (LRA) and Fisher's discriminant analysis (FDA) are two of the most popular methodologies for solving classification problems involving a dichotomous class variable and two or more attributes. Like other suboptimal classification methodologies, neither LRA nor FDA explicitly maximizes percentage accuracy in classification (PAC) for the training sample (the sample on which the model is based). A heuristic is described that shows early promise of increasing the PAC of suboptimal models. The heuristic involves refining the cutpoint used by the suboptimal model to classify observations. This refinement is accomplished by applying univariate optimal discriminant analysis (UniODA) to the predicted response function values obtained by using the suboptimal model for training data. The UniODA cutpoint, rather than the cutpoint of the suboptimal model, is then employed to classify both training and validity (hold-out) data. UniODA refinement of LRA models is demonstrated using 12 examples reflecting a variety of substantive areas, including psychology, education, geriatrics, medicine, biology, marketing, and geology. The mean PAC of UniODA-refined LRA was greater than that for nonrefined LRA in both training and validity analyses. UniODA-refined LRA yielded greater validity PAC than nonrefined LRA in 6 of the 12 examples and lower validity PAC in only 1 of the 12 examples (exactly consistent results emerged for FDA). Discussion focuses on directions for future research.
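The following sketch, which is not part of the original study, illustrates the refinement heuristic for LRA. It approximates the UniODA step with an exhaustive search over candidate cutpoints on the training-sample predicted probabilities, choosing the cutpoint that maximizes training PAC, and then applies that cutpoint to hold-out data. The function and variable names (refine_cutpoint, X_train, and so on) are hypothetical, and scikit-learn's LogisticRegression stands in for any LRA fitting routine.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def refine_cutpoint(scores, y):
    """Search candidate cutpoints on a single predicted score and return the
    one that maximizes percentage accuracy in classification (PAC) on the
    training sample; this mimics the role of UniODA for one attribute and a
    dichotomous class."""
    s = np.sort(np.unique(scores))
    # Candidate cutpoints: midpoints between adjacent distinct scores.
    candidates = (s[:-1] + s[1:]) / 2.0
    # Start from the conventional LRA cutpoint of 0.5.
    best_cut, best_pac = 0.5, np.mean((scores > 0.5) == y)
    for c in candidates:
        pac = np.mean((scores > c) == y)
        if pac > best_pac:
            best_cut, best_pac = c, pac
    return best_cut, best_pac

# Illustrative usage with synthetic data: fit LRA on the training sample,
# refine the cutpoint, then classify the validity (hold-out) sample with it.
rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(200, 3)), rng.integers(0, 2, 200)
X_valid, y_valid = rng.normal(size=(100, 3)), rng.integers(0, 2, 100)

model = LogisticRegression().fit(X_train, y_train)
p_train = model.predict_proba(X_train)[:, 1]   # predicted response values
cut, train_pac = refine_cutpoint(p_train, y_train)

p_valid = model.predict_proba(X_valid)[:, 1]
valid_pac = np.mean((p_valid > cut) == y_valid)  # PAC on the validity sample
print(f"cutpoint={cut:.3f}  train PAC={train_pac:.3f}  validity PAC={valid_pac:.3f}")
```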