A Comparison of Procedures to Detect Item Parameter Drift

Abstract
Monte Carlo methods were used to compare several measures of item parameter drift. The number of examinees, the number of items, and the number of drift items in the test were manipulated. Overall, Lord's χ² measure was the most effective in identifying items that exhibited drift. However, the measure was accurate only when the studied item's c parameter was constrained to be equal across the two assessment years. Of the remaining measures, the best methods (a z test based on Raju's exact unsigned integral, the NAEP BILOG/PARSCALE computer program's χ² by subgroup, and Kim and Cohen's closed-interval signed-area measure) required empirical estimates of critical values for the test statistics to function well. This requirement detracts from their usefulness.