Appropriateness Measurement

Abstract
Cheating to raise scores (e.g., to qualify for some desired job or training) and deliberately missing test items to lower scores (e.g., to receive an exemption from military service in a period of general mobilization) are both plausible threats to the integrity of multiple-choice tests. The goal of Appropriateness Measurement is to identify such aberrant test responding; the usual practice is to apply a mathematical procedure to an examinee's item responses that assigns a number (an index) related to the probability of aberrant responding. Eleven appropriateness indices were investigated. Three Item Response Theory indices (Drasgow, Levine, and Williams' l-naught and Tatsuoka's extended caution indices T2 and T4) were effective in detecting aberrant response patterns across a fairly wide range of conditions for a long (85-item) unidimensional test. Their effectiveness was much reduced on a short (30-item) unidimensional test. Methods were developed for combining information across several short unidimensional tests, such as are typically found in aptitude batteries, and the resulting detection rates were comparable to those for the long test. It is concluded that appropriateness indices based on Item Response Theory can be used effectively in operational test programs.
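
To make the idea of an index computed from an examinee's item responses concrete, the following is a minimal sketch of a log-likelihood index in the spirit of l-naught. It assumes a three-parameter logistic model, an ability value theta that is already known, and made-up item parameters; the function names, the scaling constant D = 1.7, and all numbers are illustrative assumptions, not details taken from the study.

```python
import math

def p_correct(theta, a, b, c, D=1.7):
    """Probability of a correct answer under an assumed 3PL model.

    a = discrimination, b = difficulty, c = lower asymptote (guessing).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))

def l0_index(responses, theta, items):
    """Log-likelihood of a 0/1 response pattern at ability theta.

    Lower (more negative) values mean the pattern is less likely
    under the model, which is taken here as a sign of possible aberrance.
    """
    total = 0.0
    for u, (a, b, c) in zip(responses, items):
        p = p_correct(theta, a, b, c)
        total += math.log(p) if u == 1 else math.log(1.0 - p)
    return total

# Hypothetical items ordered from easy to hard: (a, b, c)
items = [(1.0, -2.0, 0.2), (1.0, -1.0, 0.2), (1.0, 0.0, 0.2),
         (1.0, 1.0, 0.2), (1.0, 2.0, 0.2)]
theta = 0.0  # assumed known here; in practice it would be estimated

typical = [1, 1, 1, 0, 0]   # easy items right, hard items wrong
aberrant = [0, 0, 1, 1, 1]  # easy items wrong, hard items right

l0_typical = l0_index(typical, theta, items)
l0_aberrant = l0_index(aberrant, theta, items)
print(l0_typical, l0_aberrant)
```

In this toy setup the pattern that misses easy items while answering hard ones correctly receives the lower index value. An operational procedure would also need an ability estimate and a cutoff for flagging, neither of which is sketched here.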
