Validity of item selection: A comparison of automated computerized adaptive and manual paper-and-pencil examinations

Abstract
Background: As computerized adaptive testing (CAT) becomes more prevalent, it is important to confirm that the computerized adaptive and paper‐and‐pencil (P&P) examinations offer comparable validity and statistical performance. Purpose: The purpose of this study was to investigate the validity and statistical properties of automated item selection (CAT) compared to manual item selection (P&P). Methods: A committee of specialists rated computerized adaptive tests (CATs) and P&P examinations with regard to face validity, adherence to test specifications, ordering of items, and cognitive skill distribution. The psychometric properties were compared. Results: Results indicated that the CATs and P&P examinations were comparable for face, content, and construct validity, as well as psychometric characteristics. Conclusions: Tests constructed automatically by the computer or manually for P&P can meet the criteria for validity and statistical performance. These findings generalize to any carefully developed examination program.
