Reliability models for categorical data

1 January 1984

journal article
research article
Published by Taylor & Francis in Communications in Statistics - Theory and Methods

Vol. 13 (15), 1851-1869
https://doi.org/10.1080/03610928408828799

Abstract

An assumed hypothetical consensus category for each case being classified provides a basis for assessing the reliability of judges. Equivalent judges are characterised by the joint probability distribution of the judge assignment and the consensus category. Estimates of the conditional probabilities of judge assignment given consensus category, and of consensus category given judge assignments, serve as indices of reliability. All parameters can be estimated if the data include classifications of a number of cases by three or more judges. Restrictive assumptions are imposed to obtain models for data from classifications by two judges. Maximum likelihood estimation is discussed and illustrated by example for the case of three or more judges.
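The model summarised in the abstract can be read as a latent class model: each case has an unobserved consensus category, and equivalent judges share one table of probabilities of assignment given consensus. The sketch below is an illustration only. It assumes an EM iteration as the route to the maximum likelihood estimates (the abstract does not say which algorithm the article uses), and the function name `fit_consensus_model`, its parameters, and the toy data are all hypothetical.

```python
import math


def fit_consensus_model(counts, n_iter=200, smooth=1e-9):
    """Fit a latent consensus-category model by EM (illustrative sketch).

    counts[i][j] is the number of judges who assigned case i to category j.
    Judges are assumed equivalent, so only these counts matter.
    Returns (prior, theta, post):
      prior[c]    estimated P(consensus = c)
      theta[c][j] estimated P(judge assigns j | consensus = c)
      post[i][c]  estimated P(consensus = c | assignments for case i)
    """
    n_cases = len(counts)
    k = len(counts[0])
    # Assumed starting point: uniform prior, judges mostly agree with consensus.
    prior = [1.0 / k] * k
    theta = [[0.7 if j == c else 0.3 / (k - 1) for j in range(k)]
             for c in range(k)]
    post = []
    for _ in range(n_iter):
        # E-step: posterior of the consensus category for each case.
        post = []
        for row in counts:
            logw = [math.log(prior[c])
                    + sum(n * math.log(theta[c][j]) for j, n in enumerate(row))
                    for c in range(k)]
            top = max(logw)
            w = [math.exp(v - top) for v in logw]
            total = sum(w)
            post.append([v / total for v in w])
        # M-step: re-estimate prior and conditional assignment probabilities.
        # The tiny smoothing constant only guards against log(0).
        prior = [(sum(p[c] for p in post) + smooth) / (n_cases + k * smooth)
                 for c in range(k)]
        theta = []
        for c in range(k):
            num = [sum(p[c] * row[j] for p, row in zip(post, counts)) + smooth
                   for j in range(k)]
            tot = sum(num)
            theta.append([v / tot for v in num])
    return prior, theta, post


# Invented toy data: 6 cases, 2 categories, 3 judges per case.
toy_counts = [[3, 0], [3, 0], [2, 1], [0, 3], [1, 2], [0, 3]]
prior, theta, post = fit_consensus_model(toy_counts)
```

Here `theta` corresponds to the conditional probabilities of judge assignment given consensus category, and `post` to the probabilities of consensus category given the judge assignments, the two kinds of reliability index named in the abstract. With only two judges per case the parameters would not all be identifiable without further restrictions, which is consistent with the abstract's remark about the two-judge models.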
