Learning how to Differ: Agreement and Reliability Statistics in Psychiatry
- 1 March 1995
- journal article
- Published by SAGE Publications in The Canadian Journal of Psychiatry
- Vol. 40 (2), 60-66
- https://doi.org/10.1177/070674379504000202
Abstract
Whenever two or more raters evaluate a patient or student, it may be necessary to determine the degree to which they assign the same label or rating to the subject. The major problem in deciding which statistic to use is the plethora of different techniques that are available. This paper reviews some of the more commonly used techniques, such as raw agreement, Cohen's kappa, and weighted kappa, and shows that, in most circumstances, they can all be replaced by the intraclass correlation coefficient (ICC). This paper also shows how the ICC can be used in situations where the other statistics cannot be used, and how to select the best subset of raters.
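As a minimal illustration of the statistics the abstract names (not code from the paper itself), the sketch below computes raw agreement and Cohen's kappa for two raters, and a two-way random-effects ICC for several raters, using the ICC(2,1) form in Shrout and Fleiss notation. The ratings matrix is invented for the example.

```python
import numpy as np

def percent_agreement(r1, r2):
    """Raw agreement: fraction of subjects given the same label by both raters."""
    r1, r2 = np.asarray(r1), np.asarray(r2)
    return np.mean(r1 == r2)

def cohens_kappa(r1, r2):
    """Cohen's kappa: chance-corrected agreement for two raters on nominal labels."""
    r1, r2 = np.asarray(r1), np.asarray(r2)
    labels = np.union1d(r1, r2)
    p_o = np.mean(r1 == r2)  # observed agreement
    # Expected chance agreement from each rater's marginal label frequencies
    p_e = sum(np.mean(r1 == c) * np.mean(r2 == c) for c in labels)
    return (p_o - p_e) / (1 - p_e)

def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.
    `ratings` is an (n_subjects, k_raters) array of numeric ratings."""
    Y = np.asarray(ratings, dtype=float)
    n, k = Y.shape
    grand = Y.mean()
    row_means = Y.mean(axis=1)  # per-subject means
    col_means = Y.mean(axis=0)  # per-rater means
    # Mean squares from the two-way ANOVA decomposition
    ms_r = k * np.sum((row_means - grand) ** 2) / (n - 1)      # subjects
    ms_c = n * np.sum((col_means - grand) ** 2) / (k - 1)      # raters
    ss_e = np.sum((Y - row_means[:, None] - col_means[None, :] + grand) ** 2)
    ms_e = ss_e / ((n - 1) * (k - 1))                          # residual
    return (ms_r - ms_e) / (ms_r + (k - 1) * ms_e + k * (ms_c - ms_e) / n)

# Hypothetical example: 6 subjects rated on a 5-point scale by 3 raters
ratings = np.array([
    [4, 4, 5],
    [2, 3, 2],
    [5, 5, 5],
    [1, 2, 1],
    [3, 3, 4],
    [4, 5, 4],
])
print(percent_agreement(ratings[:, 0], ratings[:, 1]))
print(cohens_kappa(ratings[:, 0], ratings[:, 1]))
print(icc_2_1(ratings))
```

Note the asymmetry the sketch makes visible: raw agreement and kappa compare only two raters at a time, while the ICC takes the whole subjects-by-raters matrix and handles ordinal or continuous ratings directly, which is the sense in which the paper argues it can replace the pairwise statistics in most circumstances.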