Agreement among 2 × 2 Agreement Indices
- 1 June 1984
- journal article
- research article
- Published by SAGE Publications in Educational and Psychological Measurement
- Vol. 44 (2), 301-314
- https://doi.org/10.1177/0013164484442012
Abstract
A variety of measures of reliability for two-category nominal scales are reviewed and compared. It is shown that upon correcting these indices for chance agreement, there are only five distinct indices: Fleiss's modification of A1, the φ coefficient, Cohen's kappa, and two intraclass coefficients. Additional derivations indicate that when marginals are held constant, all but one of the measures are linear functions of agreement and, thus, of one another. In particular, they are equal once the maximum obtainable values for a given data set are equated. The single exception is an intraclass correlation that explicitly includes variation due to observer mean differences as part of the error variance. This index is dependent on sample size; moreover, as the number of subjects increases, this index approaches the kappa coefficient as a limit. Recommendations for choosing an index of agreement are made based on definitions, magnitude, convenience, and consistency.
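The claim that the indices coincide once their maximum obtainable values are equated can be checked numerically for two of them, Cohen's kappa and the φ coefficient. The sketch below is an illustration only, not the article's derivation: the function names, the cell labelling (a = both observers say yes, b and c = the two kinds of disagreement, d = both say no), and the example counts are assumptions made here. It also assumes no marginal total is zero.

```python
from math import sqrt, isclose

def kappa_and_phi(a, b, c, d):
    """Cohen's kappa and phi for a 2 x 2 table of counts.

    Hypothetical cell layout: a = both yes, b = observer 1 yes / observer 2 no,
    c = observer 1 no / observer 2 yes, d = both no.
    """
    n = a + b + c + d
    p1 = (a + b) / n              # observer 1 proportion of "yes"
    p2 = (a + c) / n              # observer 2 proportion of "yes"
    p_obs = (a + d) / n           # observed agreement
    p_exp = p1 * p2 + (1 - p1) * (1 - p2)   # chance agreement
    kappa = (p_obs - p_exp) / (1 - p_exp)
    phi = (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return kappa, phi

def max_agreement_table(a, b, c, d):
    """Table with the same marginals and the largest possible agreement."""
    n = a + b + c + d
    row1, col1 = a + b, a + c
    a_max = min(row1, col1)       # the yes/yes cell cannot exceed either marginal
    return a_max, row1 - a_max, col1 - a_max, n - row1 - col1 + a_max

# Made-up example counts (not data from the article).
table = (40, 10, 20, 30)
kappa, phi = kappa_and_phi(*table)
kappa_max, phi_max = kappa_and_phi(*max_agreement_table(*table))

# Raw values differ, but the rescaled values agree (both are 0.5 here).
print(kappa, phi)
print(kappa / kappa_max, phi / phi_max)
```

With the marginals fixed, both indices are proportional to ad − bc, which is why dividing each by its own maximum gives the same value.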