The inter-rater reliability and internal consistency of a clinical evaluation exercise
- 1 March 1992
- Research article
- Published by Springer Nature in Journal of General Internal Medicine
- Vol. 7 (2), 174-179
- https://doi.org/10.1007/bf02598008
Abstract
Objective: To assess the internal consistency and inter-rater reliability of a clinical evaluation exercise (CEX) format designed to be easy to use yet sufficiently detailed to achieve uniform recording of the observed examination.

Design: A comparison of 128 CEXs conducted for 32 internal medicine interns by full-time faculty. This paper reports alpha coefficients as measures of internal consistency and several measures of inter-rater reliability.

Setting: A university internal medicine program. Observations were conducted at the end of the internship year.

Participants: Participants were 32 interns, and observers were 12 full-time faculty in the department of medicine. The entire intern group was chosen in order to optimize the spectrum of abilities represented. Patients used for the study were recruited by the chief resident from the inpatient medical service based on their ability and willingness to participate.

Intervention: Each intern was observed twice, with two examiners during each CEX. The examiners were given a standardized preparation and used a format developed over five years of previous pilot studies.

Measurements and main results: The format appeared to have excellent internal consistency; alpha coefficients ranged from 0.79 to 0.99. However, multiple methods of determining inter-rater reliability yielded similar results: intraclass correlations ranged from 0.23 to 0.50, and generalizability coefficients ranged from a low of 0.00 for the overall rating of the CEX to a high of 0.61 for the physical examination section. Transforming scores to eliminate rater effects and dichotomizing results into pass-fail did not appear to enhance the reliability results.

Conclusions: Although the CEX is a valuable didactic tool, its psychometric properties preclude reliable assessment of clinical skills as a one-time observation.
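The internal-consistency measure reported above, Cronbach's alpha, can be computed directly from an examinee-by-item score matrix: alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The sketch below is purely illustrative; the ratings are hypothetical and not taken from the study's data.

```python
from statistics import pvariance

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of examinee score vectors.

    `scores` is a list of rows, one per examinee, each holding one
    score per checklist item (e.g. CEX items scored by one observer).
    Uses population variances, as in the standard alpha formula.
    """
    k = len(scores[0])                        # number of items
    items = list(zip(*scores))                # transpose: one tuple per item
    item_var_sum = sum(pvariance(col) for col in items)
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)

# Hypothetical ratings: 4 interns scored on 3 items (1-5 scale).
ratings = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 5, 5],
    [3, 3, 3],
]
print(round(cronbach_alpha(ratings), 2))  # → 0.96
```

Note that a high alpha only indicates that the items within one observation hang together; as the study's results show, it says nothing about whether two independent raters would agree, which is what the intraclass and generalizability coefficients capture.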