Abstract
The objective of this study is to compare two methods of detecting item bias within the framework of Rasch measurement. To accomplish this objective, it was first necessary to arrive at a clear understanding of the definition of bias as commonly used with Rasch measurement models. The two methods were compared on their Type I error rates in data that contain no bias and on the power of their statistics to detect item bias when bias is present. The variables manipulated in this study were sample size, magnitude of bias, number of biased items on the test, and mean differences in ability between the reference and focal groups. The two methods compared were the separate calibration t-test approach proposed by Wright and Stone in 1979 and the common calibration between-fit approach proposed by Wright, Mead, and Draba in 1976. The results indicate that the arbitrary use of bias levels such as +2 can result in the misidentification of biased items.
