A Comparison of Item- and Person-Fit Methods of Assessing Model-Data Fit in IRT
- 1 June 1990
- journal article
- Published by SAGE Publications in Applied Psychological Measurement
- Vol. 14 (2) , 127-137
- https://doi.org/10.1177/014662169001400202
Abstract
Many item-fit statistics have been proposed for assessing whether the responses to test items aggregated across examinees conform to IRT test models. Conversely, person-fit statistics have been proposed for assessing whether an examinee's responses aggregated across items are congruent with a specified IRT model. Statistical procedures to assess item fit have differed from those to assess person fit. This research compared a χ² item-fit index with a likelihood-based person-fit index. Eight 0,1 data matrices were simulated under the three-parameter logistic test model. Both the likelihood-based and χ² fit statistics were then computed for examinees and items, and Type I and Type II error rates were analyzed. With data simulated to fit the IRT model, the χ² test overidentified examinees and items as being misfitting, while the likelihood-based fit index held closer to the specified α levels. The two fit indices gave consistent (mis)fit-to-model results in 94 and 97 percent of cases for items and examinees, respectively, across simulations. Under simulated conditions of data misfit, the χ² statistic detected misfit at a higher rate than the likelihood-based statistic, indicating that the χ² statistic was slightly more sensitive to response pattern aberrancy. However, other considerations led to a recommendation for employing the likelihood-based index in applied fit analyses to evaluate both examinee and item model-data (mis)fit.
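The kind of simulation the abstract describes can be illustrated with a minimal Python sketch. It generates one 0,1 data matrix under the three-parameter logistic model and computes a likelihood-based person-fit value for each examinee. Several things here are assumptions rather than details from the article: the abstract does not name the likelihood-based index, so the standardized log-likelihood (commonly written l_z) is used as a stand-in; the sample sizes, parameter distributions, and the one-sided α = .05 cutoff are illustrative; and the true generating parameters are used in place of estimated ones. The function names are hypothetical.

```python
import numpy as np

def p_3pl(theta, a, b, c):
    """3PL probability of a correct response (rows: examinees, columns: items)."""
    z = a[None, :] * (theta[:, None] - b[None, :])
    return c[None, :] + (1.0 - c[None, :]) / (1.0 + np.exp(-z))

def simulate_responses(p, rng):
    """Draw a 0,1 response matrix from the model probabilities."""
    return (rng.random(p.shape) < p).astype(int)

def lz_person_fit(x, p):
    """Standardized log-likelihood for each examinee (assumed stand-in index)."""
    log_lik = np.sum(x * np.log(p) + (1 - x) * np.log(1 - p), axis=1)
    expected = np.sum(p * np.log(p) + (1 - p) * np.log(1 - p), axis=1)
    variance = np.sum(p * (1 - p) * np.log(p / (1 - p)) ** 2, axis=1)
    return (log_lik - expected) / np.sqrt(variance)

# Illustrative settings only; none of these values come from the article.
rng = np.random.default_rng(0)
n_persons, n_items = 500, 40
theta = rng.normal(size=n_persons)      # abilities
a = rng.uniform(0.8, 2.0, n_items)      # discriminations
b = rng.normal(size=n_items)            # difficulties
c = np.full(n_items, 0.2)               # lower asymptotes (guessing)

p = p_3pl(theta, a, b, c)
x = simulate_responses(p, rng)
lz = lz_person_fit(x, p)

# Large negative values suggest misfit; -1.645 is the one-sided normal cutoff at alpha = .05.
flagged = lz < -1.645
type_i_rate = flagged.mean()  # data fit the model, so every flag is a false positive
```

Because the data are generated to fit the model, `type_i_rate` plays the role of the empirical Type I error rate that the abstract compares against the nominal α. An item-fit analogue would aggregate over examinees (columns of `x`) instead of over items.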