Assessing intrarater, interrater and test–retest reliability of continuous measurements

24 October 2002

journal article
research article
Published by Wiley in Statistics in Medicine

Vol. 21 (22), 3431-3446
https://doi.org/10.1002/sim.1253

Abstract

In this paper we review the problem of defining and estimating intrarater, interrater and test–retest reliability of continuous measurements. We argue that the usual notion of product‐moment correlation is well adapted to a test–retest situation, whereas the concept of intraclass correlation should be used for intrarater and interrater reliability. The key difference between these two approaches is the treatment of systematic error, which for test–retest data is often due to a learning effect. We also consider the reliability of a sum and of a difference of variables and illustrate the effects on components. Further, we compare these approaches to reliability with the concept of limits of agreement proposed by Bland and Altman (for evaluating the agreement between two methods of clinical measurement) and show how product‐moment correlation is related to it. We then propose new kinds of limits of agreement which are related to intraclass correlation. A test battery for studying the development of neuro‐motor functions in children and adolescents serves as an illustration throughout the paper. Copyright 2002 John Wiley & Sons, Ltd.
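The contrast drawn in the abstract can be made concrete with a small numerical sketch. The Python below is an illustration only, not the estimators used in the paper: the data are invented, the function names (`pearson`, `icc_oneway`, `limits_of_agreement`) are hypothetical, the intraclass correlation is assumed to be the one-way random-effects form for two measurements per subject, and the limits of agreement are the classical Bland and Altman ones (mean difference plus or minus 1.96 standard deviations of the differences). It shows that a constant shift between test and retest, such as a learning effect might produce, leaves the product-moment correlation unchanged but lowers the intraclass correlation.

```python
from math import sqrt
from statistics import mean, stdev


def pearson(x, y):
    """Product-moment correlation between paired measurements x and y."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def icc_oneway(x, y):
    """One-way random-effects ICC for k = 2 measurements per subject.

    Assumption: this particular ICC form is chosen for illustration only.
    """
    n, k = len(x), 2
    subject_means = [(a + b) / 2 for a, b in zip(x, y)]
    grand = mean(subject_means)
    # Between-subject and within-subject mean squares.
    msb = k * sum((m - grand) ** 2 for m in subject_means) / (n - 1)
    msw = sum((a - m) ** 2 + (b - m) ** 2
              for a, b, m in zip(x, y, subject_means)) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


def limits_of_agreement(x, y):
    """Classical limits: mean difference +/- 1.96 SD of the differences."""
    d = [b - a for a, b in zip(x, y)]
    centre, spread = mean(d), 1.96 * stdev(d)
    return centre - spread, centre + spread


# Invented data: six subjects, small random error, optional constant shift.
first = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]
noise = [0.4, -0.3, 0.2, -0.4, 0.3, -0.2]
second_no_shift = [a + e for a, e in zip(first, noise)]
second_shift = [v + 3.0 for v in second_no_shift]  # systematic error of 3 units

for label, second in [("no shift", second_no_shift), ("shift", second_shift)]:
    low, high = limits_of_agreement(first, second)
    print(f"{label}: r = {pearson(first, second):.3f}, "
          f"ICC = {icc_oneway(first, second):.3f}, "
          f"limits of agreement = ({low:.2f}, {high:.2f})")
```

In this sketch the systematic error enters the within-subject mean square and so lowers the ICC, while it only moves the centre of the limits of agreement without changing their width.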

This publication has 13 references indexed in Scilit:

Neuromotor development from 5 to 18 years. Part 1: timed performance
Developmental Medicine and Child Neurology, 2001
Higher-moment approaches to approximate interval estimation for a certain intraclass correlation coefficient
Statistics in Medicine, 1999
A note on the use of the intraclass correlation coefficient in the evaluation of agreement between two methods of measurement
Computers in Biology and Medicine, 1990
Statistical evaluation of agreement between two methods for measuring a quantitative variable
Computers in Biology and Medicine, 1989
Statistical methods for assessing agreement between two methods of clinical measurement
The Lancet, 1986
Approximate Interval Estimation for a Certain Intraclass Correlation Coefficient
Psychometrika, 1978
The Intraclass Correlation Coefficient as a Measure of Reliability
Psychological Reports, 1966
Reliability Formulas for Independent Decision Data When Reliability Data are Matched
Psychometrika, 1960
On the Comparative Anatomy of Transformations
The Annals of Mathematical Statistics, 1957
VI. Mathematical contributions to the theory of evolution. —VI. Genetic (reproductive) selection: Inheritance of fertility in man, and of fecundity in thoroughbred racehorses
Philosophical Transactions of the Royal Society A, 1899