Outlier Measures and Norming Methods for Computerized Adaptive Tests
- 1 March 2001
- journal article
- Published by American Educational Research Association (AERA) in Journal of Educational and Behavioral Statistics
- Vol. 26 (1), 85-104
- https://doi.org/10.3102/10769986026001085
Abstract
The problem of identifying outliers has two important aspects: the choice of outlier measures and the method used to assess the degree of outlyingness (norming) of those measures. Several classes of measures for identifying outliers in Computerized Adaptive Tests (CATs) are introduced. Some of these measures are new and are constructed to take advantage of CATs’ sequential choice of items; other measures are taken directly from paper and pencil (P&P) tests and are used for baseline comparisons. Methods for assessing the degree of outlyingness of CAT responses, however, cannot be carried over directly from P&P tests, because the stopping rules associated with CATs yield examinee response sequences of varying lengths. Standard outlier measures are highly correlated with these varying lengths, which makes comparison across examinees impossible. Therefore, four methods are presented and compared that map outlier statistics to a familiar probability scale (a p value). The application of these methods to CAT data is new. The methods are explored in the context of CAT data from a 1995 Nationally Administered Computerized Examination (NACE).
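The idea of norming an outlier statistic onto a p value scale while accounting for test length can be illustrated with a minimal sketch. It is not taken from the article and is not claimed to be any of its four methods: it assumes a generic empirical approach in which each examinee's statistic is compared only with statistics from examinees who answered the same number of items. The function name `empirical_p_values` and the toy data are hypothetical.

```python
from bisect import bisect_left
from collections import defaultdict


def empirical_p_values(stats, lengths):
    """Upper-tail empirical p value of each outlier statistic,
    computed within the group of examinees sharing the same test length.

    Assumption (not from the article): larger statistics are more outlying,
    and the reference distribution is the observed same-length group.
    """
    # Collect and sort the statistics observed at each test length.
    by_length = defaultdict(list)
    for s, n in zip(stats, lengths):
        by_length[n].append(s)
    for group in by_length.values():
        group.sort()

    p_values = []
    for s, n in zip(stats, lengths):
        group = by_length[n]
        # Count of same-length statistics that are at least as large as s.
        at_least = len(group) - bisect_left(group, s)
        p_values.append(at_least / len(group))
    return p_values


# Toy data: the raw statistic grows with test length, so raw values
# are not comparable across examinees, but the normed values are.
stats = [1.0, 2.0, 3.0, 10.0, 20.0]
lengths = [10, 10, 10, 20, 20]
print(empirical_p_values(stats, lengths))
```

In this toy example the raw value 10.0 exceeds 3.0, yet after norming within length groups the examinee with 3.0 receives the smaller p value, because 3.0 is extreme among 10-item tests while 10.0 is the smallest value among 20-item tests.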