Interval estimation for Cohen's kappa as a measure of agreement
- 15 March 2000
- journal article
- research article
- Published by Wiley in Statistics in Medicine
- Vol. 19 (5), 723-741
- https://doi.org/10.1002/(sici)1097-0258(20000315)19:5<723::aid-sim379>3.0.co;2-a
Abstract
Cohen's kappa statistic is a very well known measure of agreement between two raters with respect to a dichotomous outcome. Several expressions for its asymptotic variance have been derived and the normal approximation to its distribution has been used to construct confidence intervals. However, information on the accuracy of these normal-approximation confidence intervals is not comprehensive. Under the common correlation model for dichotomous data, we evaluate 95 per cent lower confidence bounds constructed using four asymptotic variance expressions. Exact computation, rather than simulation, is employed. Specific conditions under which the use of asymptotic variance formulae is reasonable are determined. Copyright © 2000 John Wiley & Sons, Ltd.
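To make the quantities in the abstract concrete, the following is a minimal sketch of a normal-approximation lower confidence bound for kappa from a 2 x 2 table of two raters. It assumes the simple large-sample standard error commonly attributed to Cohen (1960), sqrt(po(1 - po) / (n(1 - pe)^2)), which is not necessarily one of the four variance expressions evaluated in the article. The function names and the cell layout (a, b, c, d) are illustrative choices, not taken from the article.

```python
import math
from statistics import NormalDist


def cohen_kappa_2x2(a, b, c, d):
    """Return (kappa, se) for a 2 x 2 agreement table.

    Hypothetical cell layout (an assumption for this sketch):
      a = both raters positive      b = rater 1 positive, rater 2 negative
      c = rater 1 negative, rater 2 positive      d = both raters negative
    """
    n = a + b + c + d
    if n <= 0:
        raise ValueError("table must contain at least one observation")
    po = (a + d) / n                      # observed agreement
    p1 = (a + b) / n                      # rater 1 marginal, positive
    p2 = (a + c) / n                      # rater 2 marginal, positive
    pe = p1 * p2 + (1 - p1) * (1 - p2)    # agreement expected by chance
    if pe == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    kappa = (po - pe) / (1 - pe)
    # Simple large-sample standard error (assumed here for illustration only).
    se = math.sqrt(po * (1 - po) / (n * (1 - pe) ** 2))
    return kappa, se


def lower_confidence_bound(kappa, se, level=0.95):
    """One-sided lower bound: kappa - z * se, with z the normal quantile."""
    z = NormalDist().inv_cdf(level)
    return kappa - z * se


kappa, se = cohen_kappa_2x2(40, 10, 10, 40)
bound = lower_confidence_bound(kappa, se)
```

For the illustrative table above (n = 100, observed agreement 0.8, chance agreement 0.5), kappa is 0.6 and the standard error is 0.08. The article's question is when a bound of this kind achieves its nominal coverage, which this sketch does not address.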
This publication has 33 references indexed in Scilit:
- Estimators of kappa-exact small sample properties. Journal of Statistical Computation and Simulation, 1996
- Estimating Rater Agreement in 2 x 2 Tables: Correction for Chance and Intraclass Correlation. Applied Psychological Measurement, 1993
- Another look at interrater agreement. Psychological Bulletin, 1988
- Analysing Intraclass Correlation for Dichotomous Variables. Journal of the Royal Statistical Society Series C: Applied Statistics, 1988
- Confidence intervals for the interrater agreement measure kappa. Communications in Statistics - Theory and Methods, 1987
- Ramifications of a Population Model for κ as a Coefficient of Reliability. Psychometrika, 1979
- A review of statistical methods in the analysis of data arising from observer reliability studies (Part I). Statistica Neerlandica, 1975
- Theoretical Statistics. Springer Nature, 1974
- A Coefficient of Agreement for Nominal Scales. Educational and Psychological Measurement, 1960
- Reliability of Content Analysis: The Case of Nominal Scale Coding. Public Opinion Quarterly, 1955