Agreement among 2 × 2 Agreement Indices

1 June 1984

journal article
research article
Published by SAGE Publications in Educational and Psychological Measurement

Vol. 44 (2), 301-314
https://doi.org/10.1177/0013164484442012

Abstract

A variety of measures of reliability for two-category nominal scales are reviewed and compared. It is shown that upon correcting these indices for chance agreement, there are only five distinct indices: Fleiss's modification of A₁, the φ coefficient, Cohen's kappa, and two intraclass coefficients. Additional derivations indicate that when marginals are held constant, all but one of the measures are linear functions of agreement and, thus, of one another. In particular, they are equal once the maximum obtainable values for a given data set are equated. The single exception is an intraclass correlation that explicitly includes variation due to observer mean differences as part of the error variance. This index is dependent on sample size; moreover, as the number of subjects increases, this index approaches the kappa coefficient as a limit. Recommendations for choosing an index of agreement are made based on definitions, magnitude, convenience, and consistency.
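As a rough illustration of two of the indices named in the abstract, the sketch below computes Cohen's kappa and the φ coefficient from a 2 × 2 table of counts using their standard textbook formulas. The cell layout (a, b, c, d), the function names, and the example counts are assumptions made for this illustration and are not taken from the article. The loop then holds the marginals fixed and varies only the agreement cell, which should show both indices changing in equal steps, consistent with the abstract's statement that they are linear functions of agreement.

```python
from math import sqrt


def agreement_indices(a, b, c, d):
    """Return (p_o, p_e, kappa, phi) for a 2 x 2 table of counts.

    Assumed layout (illustrative, not from the article):
        a = both observers say "yes", d = both say "no",
        b = observer 1 "yes" / observer 2 "no", c = the reverse.
    Raises ZeroDivisionError if any marginal total is zero.
    """
    n = a + b + c + d
    p_o = (a + d) / n                      # observed proportion of agreement
    p1 = (a + b) / n                       # observer 1 "yes" rate
    p2 = (a + c) / n                       # observer 2 "yes" rate
    p_e = p1 * p2 + (1 - p1) * (1 - p2)    # agreement expected by chance
    kappa = (p_o - p_e) / (1 - p_e)
    phi = (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return p_o, p_e, kappa, phi


def table_from_marginals(a, row1, col1, n):
    """Fill in b, c, d once cell a and the marginals row1, col1, n are fixed."""
    return a, row1 - a, col1 - a, n - row1 - col1 + a


# Hypothetical marginals: observer 1 says "yes" 25 times, observer 2 says
# "yes" 30 times, out of 50 subjects. Only cell a is free to vary.
for a in (10, 15, 20):
    p_o, p_e, kappa, phi = agreement_indices(*table_from_marginals(a, 25, 30, 50))
    print(f"a={a:2d}  p_o={p_o:.2f}  kappa={kappa:.3f}  phi={phi:.3f}")
```

With the marginals fixed, chance agreement p_e and the denominator of φ are constants, so each index is a straight-line function of the observed agreement p_o. The two lines have different slopes unless the observers' marginals match.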
