A Comparison of Global Ratings and Checklist Scores from an Undergraduate Assessment Using an Anesthesia Simulator

Abstract
This study aimed to determine the correlation between global ratings and criterion-based checklist scores, and the inter-rater reliability of each, in a performance assessment using an anesthesia simulator. All final-year medical students at the University of Toronto were invited to work through a 15-minute faculty-facilitated scenario using an anesthesia simulator. Students' performances were videotaped and analyzed by two faculty members using a 25-point criterion-based checklist and a five-point global rating of competency (1 = clear failure, 5 = superior performance). Correlations were determined between global ratings and checklist scores, and between global ratings and specific performance competencies (knowledge, technical skills, and judgment). Checklist and global scores were converted to percentages, and the means of the two marks were compared. The mean reliability of a single rater was determined for both checklist scores and global ratings. The correlation between checklist scores and global ratings was .74. Mean checklist and global scores were both low (58.67%, SD = 14.96, and 57.08%, SD = 24.27, respectively); the difference between them was not statistically significant. For a single rater, the mean reliability across rater pairs was .77 (range .58–.93) for checklist scores and .62 (range .40–.77) for global ratings. Global ratings correlated more highly with technical skills and judgment (r = .51 and r = .53, respectively) than with knowledge (r = .24). Inter-rater reliability was higher for checklist scores than for global ratings; however, global ratings demonstrated acceptable inter-rater reliability and may be useful for assessing competency in simulator-based performance assessments.