How many raters? Toward the most reliable diagnostic consensus

1 January 1992

journal article
research article
Published by Wiley in Statistics in Medicine

Vol. 11 (3) , 317-331
https://doi.org/10.1002/sim.4780110305

Abstract

When faced with a decision whether or not to treat a patient, to enter or to withdraw a patient from a clinical trial, or any other such binary decision, based on diagnosis with unsatisfactory reliability, can a consensus diagnosis be used to improve reliability? If so, exactly how? That is the question I address here. I draw comparisons and contrasts between the known results with an interval consensus and those with a binary consensus and suggest tactics for use in a pilot study to answer the above questions.
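
As an illustration only: the abstract does not state any formulas, so the sketch below assumes that the "known results with an interval consensus" include the classical Spearman-Brown formula, under which the reliability of the mean of m parallel raters is m·r / (1 + (m − 1)·r) for single-rater reliability r. The function names `spearman_brown` and `raters_needed` are hypothetical, and nothing here should be read as applying to a binary consensus, which the abstract explicitly contrasts with the interval case.

```python
import math


def spearman_brown(r: float, m: int) -> float:
    """Reliability of the mean of m parallel raters (interval consensus),
    given single-rater reliability r, per the Spearman-Brown formula.
    Assumption: raters are interchangeable with equal reliability."""
    if not 0.0 < r < 1.0:
        raise ValueError("r must lie strictly between 0 and 1")
    if m < 1:
        raise ValueError("m must be at least 1")
    return m * r / (1.0 + (m - 1) * r)


def raters_needed(r: float, target: float) -> int:
    """Smallest number of raters whose averaged rating reaches the target
    reliability, obtained by inverting the Spearman-Brown formula."""
    if not (0.0 < r < 1.0 and 0.0 < target < 1.0):
        raise ValueError("r and target must lie strictly between 0 and 1")
    m = target * (1.0 - r) / (r * (1.0 - target))
    # Small tolerance so floating-point noise does not add a spurious rater.
    return max(1, math.ceil(m - 1e-9))


if __name__ == "__main__":
    # Hypothetical pilot-study value: single-rater reliability of 0.5.
    for m in (1, 2, 3, 5):
        print(m, round(spearman_brown(0.5, m), 3))
    print("raters needed for 0.8:", raters_needed(0.5, 0.8))
```

In a pilot study, an estimate of r would replace the hypothetical 0.5 used here.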
