Abstract
This study compared the reliability and validity indices of randomly parallel tests administered under inclusion, exclusion, and correction-for-guessing directions. It also compared the criterion-referenced grading decisions based on the different scoring methods. Inclusion and exclusion scores were not as highly correlated as theory would predict. There were no significant differences in the reliability and validity indices for the three scoring methods. However, the scoring methods differed substantially in the proportion of students assigned to different grade categories.