Efficiency of Multiple-Choice Tests as a Function of Spread of Item Difficulties

1 June 1952

journal article
Published by Cambridge University Press (CUP) in Psychometrika

Vol. 17 (2) , 127-147
https://doi.org/10.1007/bf02288778

Abstract

The validity of a univocal multiple-choice test is determined for varying distributions of item difficulty and varying degrees of item precision. Validity is a function of σ_d² + σ_y², where σ_d measures item unreliability and σ_y measures the spread of item difficulties. When this variance is very small, validity is high for one optimum cutting score, but the test gives relatively little valid information for other cutting scores. As this variance increases, eta increases up to a certain point, and then begins to decrease. Screening validity at the optimum cutting score declines as this variance increases, but the test becomes much more flexible, maintaining the same validity for a wide range of cutting scores. For items of the type ordinarily used in psychological tests, the test with uniform item difficulty gives greater over-all validity, and superior validity for most cutting scores, compared to a test with a range of item difficulties. When a multiple-choice test is intended to reject the poorest F per cent of the men tested, items should on the average be located at or above the threshold for men whose true ability is at the Fth percentile.
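The role of the combined variance σ_d² + σ_y² can be explored with a small simulation. The following is a minimal sketch under assumptions that are not taken from the paper: ability is standard normal, item difficulties are drawn from a normal distribution with standard deviation sigma_y, and an examinee passes an item when ability plus independent normal noise (standard deviation sigma_d) exceeds the item's difficulty. It ignores chance success from guessing, which a treatment of multiple-choice items would normally have to account for, and it uses the Pearson correlation between score and ability as a rough stand-in for over-all validity rather than eta or validity at a particular cutting score. The names simulate_scores and pearson are hypothetical helpers introduced for illustration.

```python
import math
import random


def simulate_scores(n_people, n_items, sigma_d, sigma_y,
                    mean_difficulty=0.0, seed=0):
    """Simulate number-right scores under an assumed normal-ogive model.

    Assumptions (illustrative only): abilities are N(0, 1), item
    difficulties are N(mean_difficulty, sigma_y**2), and there is no guessing.
    """
    rng = random.Random(seed)
    # Spread of item difficulties is controlled by sigma_y.
    difficulties = [rng.gauss(mean_difficulty, sigma_y) for _ in range(n_items)]
    abilities, scores = [], []
    for _ in range(n_people):
        theta = rng.gauss(0.0, 1.0)
        # Item unreliability: fresh noise with s.d. sigma_d on every response.
        score = sum(theta + rng.gauss(0.0, sigma_d) > b for b in difficulties)
        abilities.append(theta)
        scores.append(score)
    return abilities, scores


def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Compare a test with uniform item difficulty (sigma_y = 0)
    # against tests with progressively wider difficulty spreads.
    for sigma_y in (0.0, 0.5, 1.0, 2.0):
        abilities, scores = simulate_scores(
            n_people=2000, n_items=40, sigma_d=1.0, sigma_y=sigma_y, seed=42
        )
        r = pearson(abilities, scores)
        print(f"sigma_y={sigma_y:.1f}  score-ability correlation={r:.3f}")
```

Under these assumptions, the expected proportion of items passed at ability θ is Φ((θ − μ) / √(σ_d² + σ_y²)), where μ is the mean item difficulty. That is one way to see why the two variances enter as a sum: a wider spread of difficulties flattens the test's response curve in the same way that noisier items do.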
