Evaluating evaluation

Abstract
The American Board of Internal Medicine suggests using a standard form to rate residents on nine dimensions (such as clinical judgment and overall clinical competence) on a scale of 1 to 9. The authors examined the psychometric evidence for the reliability and validity of 1,039 ratings of 85 residents by 135 attendings in a single internal medicine residency program. Of these ratings, 95.6% fell between 6 and 9. Factor analysis revealed that the high correlations among the nine dimensions (r ranged from 0.72 to 0.92) resulted from a single global factor accounting for 86% of the variance. The study also examined whether the form reliably distinguishes among residents scoring between 6 and 9. Agreement among attendings rating the same individual was weak (average reliability = 0.64, by the method of James). The rating method fails to discriminate among dimensions of clinical care and has low reliability for distinguishing among competent residents.
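
As an illustration of the kind of agreement statistic reported above, the following is a minimal sketch of a single-item within-group agreement index of the type associated with James and colleagues. The abstract does not say which variant was used, so this is an assumption: the sketch compares the observed variance of the attendings' ratings of one resident with the variance expected if raters had chosen uniformly among the scale points. The function name `rwg`, the parameter `n_options`, and the example ratings are hypothetical and are not taken from the study.

```python
from statistics import variance


def rwg(ratings, n_options=9):
    """Single-item within-group agreement (assumed uniform null distribution).

    ratings   -- scores given to ONE resident by several attendings
    n_options -- number of scale points assumed available to the raters
    """
    if len(ratings) < 2:
        raise ValueError("need at least two ratings of the same resident")
    # Variance of a discrete uniform distribution over n_options points.
    expected_var = (n_options ** 2 - 1) / 12
    # Sample variance (n - 1 denominator) of the observed ratings.
    observed_var = variance(ratings)
    # 1.0 means perfect agreement; values can fall below 0 when raters
    # disagree more than the uniform null would predict.
    return 1 - observed_var / expected_var


# Hypothetical ratings of one resident by four attendings (not study data).
example = [6, 7, 8, 9]
full_scale = rwg(example, n_options=9)   # null spans the whole 1-9 scale
restricted = rwg(example, n_options=4)   # null spans only the 6-9 range
```

Under these assumptions, the same set of ratings looks far more consistent when judged against the full nine-point scale than against the four points from 6 to 9 that raters actually used, which is why the choice of reference range matters when asking whether a form distinguishes among competent residents.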