Correcting Performance-Rating Errors in Oral Examinations
- 1 March 1991
- journal article
- Published by SAGE Publications in Evaluation & the Health Professions
- Vol. 14 (1), 100-122
- https://doi.org/10.1177/016327879101400107
Abstract
Although oral examinations are widely used for making decisions regarding an individual's level of competence, they are frequently of limited reliability. A significant part of the error in oral performance ratings is due to the tendency for some evaluators to be lenient and others to be stringent in their assignment of ratings. This article describes and evaluates a simple method to identify and correct for errors of leniency and stringency. The method, which is based on a regression model recommended by Wilson (1988), extends and simplifies the procedures recommended by Cason and Cason (1984, 1985). The method provides an estimate of each individual's performance that has been corrected for errors of leniency and stringency. In addition, it produces for each rater an index of leniency or stringency and several other statistics useful in evaluating the properties of rating data. The regression method is applied to performance ratings from three separate administrations of an oral examination in a medical specialty. The results indicate modest but significant levels of leniency and stringency error; correcting for such errors would change the pass/fail decisions for about 6% of the examinees. Limitations of the procedure, as well as the need for additional research, are discussed.
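The abstract does not give the regression model itself, so the following is only an illustrative sketch of the general idea, not the article's procedure. It assumes a simple additive model in which each rating is the sum of an examinee effect and a rater effect, fitted by ordinary least squares with the rater effects constrained to sum to zero. The function name `fit_rater_effects`, the toy data, and the sign convention (positive rater effect means lenient) are all assumptions made for illustration.

```python
import numpy as np

def fit_rater_effects(examinee, rater, rating):
    """Fit rating = examinee_effect + rater_effect by least squares.

    Hypothetical helper: an additive two-way model, assumed here for
    illustration; it is not taken from the article.
    """
    examinee = np.asarray(examinee)
    rater = np.asarray(rater)
    rating = np.asarray(rating, dtype=float)

    # Map examinee and rater labels to column indices.
    e_ids, e_idx = np.unique(examinee, return_inverse=True)
    r_ids, r_idx = np.unique(rater, return_inverse=True)
    n, n_e, n_r = len(rating), len(e_ids), len(r_ids)

    # One indicator column per examinee, then one per rater.
    X = np.zeros((n + 1, n_e + n_r))
    X[np.arange(n), e_idx] = 1.0
    X[np.arange(n), n_e + r_idx] = 1.0

    # Extra row: identify the model by asking rater effects to sum to zero.
    X[n, n_e:] = 1.0
    y = np.append(rating, 0.0)

    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    corrected = dict(zip(e_ids.tolist(), coef[:n_e]))
    rater_effect = dict(zip(r_ids.tolist(), coef[n_e:]))
    return corrected, rater_effect

# Toy, noise-free data (invented): each examinee is seen by two of three raters.
examinees = ["A", "A", "B", "B", "C", "C"]
raters = ["r1", "r2", "r2", "r3", "r1", "r3"]
ratings = [6.0, 4.0, 6.0, 7.0, 7.0, 6.0]

corrected, rater_effect = fit_rater_effects(examinees, raters, ratings)
print(corrected)     # corrected score per examinee
print(rater_effect)  # leniency (+) or stringency (-) index per rater
```

In this toy data, examinees B and C have the same raw mean rating (6.5), but B was rated by the more stringent pair of raters, so the fitted examinee effects separate them. A decision near a cut score could change in the same way, which is the kind of effect the abstract reports for about 6% of examinees. The sketch only works when the rating design is connected, meaning raters share enough examinees to be compared with one another.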
This publication has 10 references indexed in Scilit:
- Assessment of clinical skills with standardized patients: State of the art. Teaching and Learning in Medicine, 1990
- Parameter Estimation for Peer Grading under Incomplete Design. Educational and Psychological Measurement, 1988
- Missing Data in Evaluation Research. Evaluation & the Health Professions, 1986
- A Deterministic Theory of Clinical Performance Rating. Evaluation & the Health Professions, 1984
- Two Simple Models for Rater Effects. Applied Psychological Measurement, 1984
- Balanced Incomplete Block Designs for Inter-Rater Reliability Studies. Applied Psychological Measurement, 1981
- Effects of rater training: Creating new response sets and decreasing accuracy. Journal of Applied Psychology, 1980
- Effects of rater training on leniency and halo errors in student ratings of instructors. Journal of Applied Psychology, 1978
- The Validity and Reliability of Oral Examinations in Assessing Cognitive Skills in Medicine. Journal of Educational Measurement, 1970
- A Coefficient of Agreement for Nominal Scales. Educational and Psychological Measurement, 1960