Reliability of Test Scores and Decisions

A criterion-referenced test can be viewed as testing either a continuous or a binary variable, and the scores on a test can be used as measurements of the variable or to make decisions (e.g., pass or fail). Recent work on the reliability of criterion-referenced tests has focused on the use of scores from tests of continuous variables for decision-making purposes. This work can be categorized according to type of loss function: threshold, linear, or quadratic. It is the loss function that is used either explicitly or implicitly to evaluate the goodness of the decisions that are made on the basis of the test scores. The literature in which a threshold loss function is employed can be further subdivided according to whether the goodness of decisions is assessed as the probability of making an erroneous decision or as a measure of the consistency of decisions over repeated testing occasions. This review points to the need for simple procedures by which to estimate the probability of decision errors.
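
To make the three loss-function types concrete, the following Python sketch shows one common way they might be formalized for a pass/fail decision at a cut score. It is an illustration under stated assumptions, not the formulation of any particular study in the literature reviewed: the function names (`is_master`, `decision_loss`, `decision_consistency`), the use of distance from the cut score as the size of the loss, and the convention that a score equal to the cut counts as a pass are all hypothetical choices made here.

```python
def is_master(score, cut):
    """Classify a score as pass (True) or fail (False) at the cut score.

    Assumption: a score exactly at the cut counts as a pass.
    """
    return score >= cut


def decision_loss(true_score, observed_score, cut, kind="threshold"):
    """Loss incurred by deciding on the observed score instead of the true score.

    A correct decision costs nothing. For an erroneous decision:
      threshold -> a constant loss of 1, whatever the size of the error
      linear    -> loss equal to the true score's distance from the cut
      quadratic -> loss equal to the square of that distance
    These scalings are illustrative assumptions.
    """
    if is_master(true_score, cut) == is_master(observed_score, cut):
        return 0.0
    distance = abs(true_score - cut)
    if kind == "threshold":
        return 1.0
    if kind == "linear":
        return float(distance)
    if kind == "quadratic":
        return float(distance) ** 2
    raise ValueError(f"unknown loss type: {kind!r}")


def decision_consistency(scores_a, scores_b, cut):
    """Proportion of examinees given the same decision on two testing occasions."""
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("need two non-empty score lists of equal length")
    same = sum(
        is_master(a, cut) == is_master(b, cut)
        for a, b in zip(scores_a, scores_b)
    )
    return same / len(scores_a)
```

Under these assumptions, an examinee with a true score of 5 who passes with an observed score of 8 against a cut score of 7 incurs a threshold loss of 1, a linear loss of 2, and a quadratic loss of 4. The consistency function corresponds to the second branch of the threshold-loss literature: it needs only observed scores from two occasions, whereas the loss functions require the true score, which in practice is unknown and must be estimated.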

