Sensitivity of Equating Results to Different Sampling Strategies
- 1 January 1990
- journal article
- Published by Taylor & Francis in Applied Measurement in Education
- Vol. 3 (1), 53-71
- https://doi.org/10.1207/s15324818ame0301_5
Abstract
In this article, the results of equating two parallel forms of the College Board Biology Achievement Test using three different sampling strategies are discussed. New-form data were collected during a fall administration of the test, and old-form data were collected at a spring administration. The group taking the test in the spring was much more able, as measured by test score, than the group taking the test in the fall. The three sampling strategies studied were representative sampling, matched sampling, and reference or target sampling. For each sampling strategy, five equating procedures were studied: Tucker and Levine unequally reliable linear equatings, frequency estimation equipercentile and chained equipercentile curvilinear equatings, and three-parameter logistic (3PL) item response theory (IRT) true-score equating. The criterion for comparison in all cases was the results of a Tucker linear equating from a fall new-form/fall old-form representative sampling data collection design. Results of this study indicated that matching on a set of common items provided greater agreement among the results of the various equating procedures studied than was obtained under representative sampling. In addition, for all equating procedures, the results of equating with samples matched on common item scores agreed more closely with the criterion equating than did the equating results from representative samples. Matching to an external target population produced agreement among methods, but did not agree as closely with the criterion equating as matching to the new form on the basis of common item scores. The equating models least affected by differences in new-form and old-form sample abilities were the Tucker and frequency estimation equipercentile models, and the procedure most affected by ability differences was the 3PL IRT procedure.
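The abstract does not say how the matched samples were drawn, so the following is only a minimal sketch of one plausible way to match an old-form sample to a new-form sample on common item (anchor) scores. The function name `match_on_anchor`, its arguments, and the proportional-subsampling rule are all assumptions made for illustration, not the procedure used in the study.

```python
import random
from collections import Counter, defaultdict


def match_on_anchor(old_anchor, new_anchor, seed=0):
    """Hypothetical helper: pick indices into old_anchor so that the chosen
    subsample has the same anchor-score proportions as new_anchor."""
    rng = random.Random(seed)

    # How many new-form examinees obtained each anchor score.
    new_counts = Counter(new_anchor)

    # Group old-form examinees (by index) according to anchor score.
    by_score = defaultdict(list)
    for idx, score in enumerate(old_anchor):
        by_score[score].append(idx)

    # Largest scale factor for which every anchor-score level can be filled
    # from the available old-form examinees (0 if some level is missing).
    scale = min(len(by_score[s]) / n for s, n in new_counts.items())

    chosen = []
    for s, n in new_counts.items():
        # Number to draw at this score level, capped at what is available.
        k = min(int(scale * n), len(by_score[s]))
        chosen.extend(rng.sample(by_score[s], k))
    return sorted(chosen)
```

Under these assumptions, the returned indices identify an old-form subsample whose anchor-score distribution is proportional to that of the new-form group; any of the equating procedures named above would then be applied to the matched samples rather than to the full, more able spring group.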