Outlier Measures and Norming Methods for Computerized Adaptive Tests

Abstract

The problem of identifying outliers has two important aspects: the choice of outlier measures and the method used to assess the degree of outlyingness of those measures (norming). Several classes of measures for identifying outliers in Computerized Adaptive Tests (CATs) are introduced. Some of these measures are new and are constructed to take advantage of CATs’ sequential choice of items; others are taken directly from paper-and-pencil (P&P) tests and are used for baseline comparisons. Methods for assessing the degree of outlyingness of CAT responses, however, cannot be carried over directly from P&P tests, because the stopping rules associated with CATs yield examinee responses of varying lengths. Standard outlier measures are highly correlated with these varying lengths, which makes comparison across examinees impossible. Therefore, four methods are presented and compared that map outlier statistics to a familiar probability scale (a p value). The application of these methods to CAT data is new. The methods are explored in the context of CAT data from a 1995 Nationally Administered Computerized Examination (NACE).
