When diagnostic agreement is high, but reliability is low: Some paradoxes occurring in joint independent neuropsychology assessments
- 1 October 1988
- Research article
- Published by Taylor & Francis in Journal of Clinical and Experimental Neuropsychology
- Vol. 10 (5) , 605-622
- https://doi.org/10.1080/01688638808402799
Abstract
Two paradoxes can occur when neuropsychologists attempt to assess the reliability of a dichotomous diagnostic instrument (e.g., one measuring the presence or absence of Dyslexia or Autism). The first paradox occurs when two pairs of examiners produce the same high level of overall agreement (e.g., 85%), yet the level of chance-corrected agreement is relatively high for one pair (e.g., .70) and quite low for the other (e.g., .32). To illustrate the second paradox, consider two examiners who are in 80% agreement in their overall diagnosis of Dyslexia. Assume, further, that they agree exactly on the proportion of cases each diagnoses as Dyslexic (20%) and as Non-Dyslexic (80%). Somewhat paradoxically, the level of chance-corrected interexaminer agreement for this pair of examiners works out to only .37. In distinct contrast, a second pair of examiners, also in 80% overall agreement, disagrees appreciably with respect to diagnostic assignments: the first neuropsychologist (a) classifies 65% of the cases as Non-Dyslexic, as opposed to 45% so diagnosed by the second neuropsychologist, and (b) classifies the remaining 35% as Dyslexic, as compared to 55% so classified by the second examiner. Despite this disagreement, the second pair of examiners produces a much higher level of chance-corrected agreement than the first pair, namely .61. The underlying reasons for both of these paradoxes, as well as their resolution, are presented.