Reliability of assessment tools in rehabilitation: an illustration of appropriate statistical analyses
- 1 June 1998
- journal article
- research article
- Published by SAGE Publications in Clinical Rehabilitation
- Vol. 12 (3), 187-199
- https://doi.org/10.1191/026921598672178340
Abstract
Objective: To provide a practical guide to appropriate statistical analysis of a reliability study using real-time ultrasound for measuring muscle size as an example.
Design: Inter-rater and intra-rater (between-scans and between-days) reliability.
Subjects: Ten normal subjects (five male) aged 22–58 years.
Method: The cross-sectional area (CSA) of the anterior tibial muscle group was measured using real-time ultrasonography.
Main outcome measures: Intraclass correlation coefficients (ICCs) and the 95% confidence interval (CI) for the ICCs, and the Bland and Altman method for assessing agreement, which includes calculation of the mean difference between measures (d), the 95% CI for d, the standard deviation of the differences (SDdiff), the 95% limits of agreement and a reliability coefficient.
Results: Inter-rater reliability was high: ICC (3,1) was 0.92 with a 95% CI of 0.72 → 0.98. There was reasonable agreement between measures on the Bland and Altman test, as d was -0.63 cm², the 95% CI for d was -1.4 → 0.14 cm², the SDdiff was 1.08 cm², the 95% limits of agreement were -2.73 → 1.53 cm² and the reliability coefficient was 2.4. Between-scans repeatability was high: ICCs (1,1) were 0.94 and 0.93 with 95% CIs of 0.8 → 0.99 and 0.75 → 0.98, for days 1 and 2 respectively. Measures showed good agreement on the Bland and Altman test: d for day 1 was 0.15 cm² and for day 2 it was -0.32 cm²; the 95% CIs for d were -0.51 → 0.81 cm² for day 1 and -0.98 → 0.34 cm² for day 2; SDdiff was 0.93 cm² for both days; the 95% limits of agreement were -1.71 → 2.01 cm² for day 1 and -2.18 → 1.54 cm² for day 2; the reliability coefficient was 1.80 for day 1 and 1.88 for day 2. The between-days ICC (1,2) was 0.92 and the 95% CI 0.69 → 0.98. The d was -0.98 cm², the SDdiff was 1.25 cm² with 95% limits of agreement of -3.48 → 1.52 cm² and the reliability coefficient 2.8. The 95% CI for d (-1.88 → -0.08 cm²) and the distribution graph showed a bias towards a larger measurement on day 2.
Conclusions: The ICC and Bland and Altman tests are appropriate for analysis of reliability studies of similar design to that described, but neither test alone provides sufficient information and it is recommended that both are used.
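The two kinds of statistic named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code, and several choices in it are assumptions: the function names are hypothetical, the limits of agreement use a multiplier of 2 standard deviations (1.96 is also common), the t value of 2.262 is the two-sided 95% value for 9 degrees of freedom (n = 10), and the ICC forms are the standard single-measure ICC(1,1) and ICC(3,1) built from ANOVA mean squares. The reliability coefficient and the CIs for the ICCs are left out because the abstract does not state how they were computed.

```python
import math
from statistics import mean, stdev


def bland_altman_summary(mean_diff, sd_diff, n, t_crit=2.262, k=2.0):
    """Agreement statistics from the mean and SD of n paired differences.

    t_crit: two-sided 95% t value for n - 1 degrees of freedom
            (2.262 assumes n = 10).
    k:      SD multiplier for the limits of agreement (2 assumed here).
    """
    half_width = t_crit * sd_diff / math.sqrt(n)
    return {
        "mean_diff": mean_diff,
        "sd_diff": sd_diff,
        "ci_mean_diff": (mean_diff - half_width, mean_diff + half_width),
        "limits_of_agreement": (mean_diff - k * sd_diff,
                                mean_diff + k * sd_diff),
    }


def bland_altman(a, b, t_crit, k=2.0):
    """Same statistics, starting from two lists of paired measurements."""
    diffs = [x - y for x, y in zip(a, b)]
    return bland_altman_summary(mean(diffs), stdev(diffs), len(diffs),
                                t_crit, k)


def icc_single(scores):
    """ICC(1,1) and ICC(3,1) for a table of n subjects (rows) by k raters."""
    n, k = len(scores), len(scores[0])
    grand = mean(v for row in scores for v in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(row[j] for row in scores) for j in range(k)]

    ss_between = k * sum((m - grand) ** 2 for m in row_means)
    ss_within = sum((v - m) ** 2
                    for row, m in zip(scores, row_means) for v in row)
    ss_raters = n * sum((m - grand) ** 2 for m in col_means)

    bms = ss_between / (n - 1)                          # between-subjects MS
    wms = ss_within / (n * (k - 1))                     # within-subjects MS
    ems = (ss_within - ss_raters) / ((n - 1) * (k - 1)) # residual MS

    return {
        "icc_1_1": (bms - wms) / (bms + (k - 1) * wms),
        "icc_3_1": (bms - ems) / (bms + (k - 1) * ems),
    }


# Between-days summary values taken from the abstract.
between_days = bland_altman_summary(-0.98, 1.25, 10)
print(between_days["limits_of_agreement"])
print(between_days["ci_mean_diff"])

# Made-up scores, not study data: rater 2 always reads 1 unit higher.
toy = [[1, 2], [2, 3], [3, 4], [4, 5]]
print(icc_single(toy))
```

Fed the between-days summary values, the sketch gives limits of agreement that match the reported -3.48 → 1.52 cm² and a CI for d close to the reported one; small differences are expected because the inputs are already rounded. In the made-up table, ICC(3,1) equals 1 despite the constant offset between raters while ICC(1,1) is lower, which illustrates why a bias can go unnoticed when only one statistic is reported.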