Assessing the reliability of clinical scales when the data have both nominal and ordinal features: Proposed guidelines for neuropsychological assessments
- 1 September 1992
- research article
- Published by Taylor & Francis in Journal of Clinical and Experimental Neuropsychology
- Vol. 14 (5) , 673-686
- https://doi.org/10.1080/01688639208402855
Abstract
The purpose of this article is to present, for the first time, a comprehensive methodology for assessing the reliability of a clinical scale that is frequently utilized in neuropsychological research and, more generally, in biomedical studies. The dichotomous-ordinal scale is characterized by a single category of “absence” and two or more ordinalized categories of “presence” of a symptom, trait, state, or behavior, and it has special properties that need to be understood in order for its reliability to be appropriately assessed. Using the Brief Psychiatric Rating Scale (BPRS) as a clinical example, we cover the principles of expressing scale reliability as a dichotomy (“absence”–“presence” of a given BPRS symptom); as a trichotomy (“none”; “mild to moderate” symptomatology; and “severe” symptomatology); and as the full 7-category dichotomous-ordinal scale: “none,” “very mild,” “mild,” “moderate,” “moderately severe,” “severe,” and “extremely severe.” Criteria are presented for evaluating which of these three formats produces the most reliable results. Finally, we address, with a second sample, the important issue of replication, or whether the original reliability findings generalize to other independent populations.
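The abstract's core idea — that the same 7-point dichotomous-ordinal ratings can be evaluated for inter-rater reliability at three granularities — can be sketched in code. The example below is illustrative only: the ratings are synthetic, the trichotomy cut-points (0 = "none"; 1–4 = "mild to moderate"; 5–6 = "severe") are an assumed reading of the abstract's labels, and plain unweighted Cohen's kappa is used as a stand-in for whichever chance-corrected statistic the article actually employs.

```python
# Illustrative sketch (synthetic data, assumed cut-points; the article's
# actual statistic may differ -- unweighted Cohen's kappa is used here).

def cohen_kappa(a, b, categories):
    """Unweighted Cohen's kappa for two raters' paired categorical ratings."""
    n = len(a)
    idx = {c: i for i, c in enumerate(categories)}
    k = len(categories)
    # Build the k x k agreement table.
    table = [[0] * k for _ in range(k)]
    for x, y in zip(a, b):
        table[idx[x]][idx[y]] += 1
    # Observed agreement: proportion of cases on the diagonal.
    po = sum(table[i][i] for i in range(k)) / n
    # Chance-expected agreement from the marginal totals.
    row = [sum(table[i]) for i in range(k)]
    col = [sum(table[i][j] for i in range(k)) for j in range(k)]
    pe = sum(row[i] * col[i] for i in range(k)) / n ** 2
    return 1.0 if pe == 1 else (po - pe) / (1 - pe)

def to_trichotomy(r):
    # Assumed collapse: 0 = "none"; 1-4 = "mild to moderate"; 5-6 = "severe".
    return 0 if r == 0 else (1 if r <= 4 else 2)

def to_dichotomy(r):
    # "Absence" (0) vs. "presence" (1-6) of the symptom.
    return 0 if r == 0 else 1

# Two raters scoring ten cases on the 7-point scale (0 = "none" ... 6 =
# "extremely severe"); these ratings are invented for the demonstration.
rater1 = [0, 1, 3, 5, 6, 0, 2, 4, 1, 0]
rater2 = [0, 2, 3, 6, 6, 0, 2, 3, 0, 0]

k7 = cohen_kappa(rater1, rater2, list(range(7)))
k3 = cohen_kappa([to_trichotomy(r) for r in rater1],
                 [to_trichotomy(r) for r in rater2], [0, 1, 2])
k2 = cohen_kappa([to_dichotomy(r) for r in rater1],
                 [to_dichotomy(r) for r in rater2], [0, 1])
print(f"7-category kappa: {k7:.3f}")
print(f"trichotomy kappa: {k3:.3f}")
print(f"dichotomy kappa:  {k2:.3f}")
```

Comparing the three kappas side by side is one simple way to ask the abstract's question of which format yields the most reliable results, though the article's own criteria for that comparison are more elaborate than a raw numeric ranking.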