ESTIMATING THE RELIABILITY, VALIDITY, AND INVALIDITY OF ESSAY RATINGS

1 March 1985

journal article
Published by Wiley in Journal of Educational Measurement

Vol. 22 (1), 41-52
https://doi.org/10.1111/j.1745-3984.1985.tb01048.x

Abstract

In an essay rating study multiple ratings may be obtained by having different raters judge essays or by having the same rater(s) repeat the judging of essays. An important question in the analysis of essay ratings is whether multiple ratings, however obtained, may be assumed to represent the same true scores. When different raters judge the same essays only once, it is impossible to answer this question. In this study 16 raters judged 105 essays on two occasions; hence, it was possible to test assumptions about true scores within the framework of linear structural equation models. It emerged that the ratings of a given rater on the two occasions represented the same true scores. However, the ratings of different raters did not represent the same true scores. The estimated intercorrelations of the true scores of different raters ranged from .415 to .910. Parameters of the best fitting model were used to compute coefficients of reliability, validity, and invalidity. The implications of these coefficients are discussed.
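The reasoning in the abstract can be illustrated with a small simulation. This is only a sketch under classical test theory assumptions, not the linear structural equation analysis reported in the article. The function names, the two-rater setup, and all numeric settings (true-score correlation, error standard deviation, seed) are hypothetical choices made for illustration. Each simulated rater keeps the same true score for an essay on both occasions, so the correlation between occasions estimates that rater's reliability, and the cross-rater correlation corrected for unreliability estimates how closely the two raters' true scores agree.

```python
import numpy as np


def simulate_ratings(n_essays=105, true_corr=0.7, error_sd=0.6, seed=0):
    """Simulate two raters judging the same essays on two occasions.

    Returns an array of shape (n_essays, 2 raters, 2 occasions).
    All parameter values are illustrative, not taken from the study.
    """
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, true_corr], [true_corr, 1.0]])
    # One true score per essay and rater; raters' true scores correlate at true_corr.
    true_scores = rng.multivariate_normal(np.zeros(2), cov, size=n_essays)
    # Independent measurement error for every rater-occasion combination.
    errors = rng.normal(0.0, error_sd, size=(n_essays, 2, 2))
    # The same true score is reused on both occasions for a given rater.
    return true_scores[:, :, None] + errors


def retest_reliability(ratings, rater):
    """Correlation between a rater's two occasions (reliability estimate)."""
    return float(np.corrcoef(ratings[:, rater, 0], ratings[:, rater, 1])[0, 1])


def true_score_correlation(ratings):
    """Cross-rater correlation corrected for attenuation due to unreliability."""
    rel_a = retest_reliability(ratings, 0)
    rel_b = retest_reliability(ratings, 1)
    # Average the four observed cross-rater correlations (occasion i vs. occasion j).
    cross = [
        np.corrcoef(ratings[:, 0, i], ratings[:, 1, j])[0, 1]
        for i in range(2)
        for j in range(2)
    ]
    return float(np.mean(cross) / np.sqrt(rel_a * rel_b))


ratings = simulate_ratings()
print("reliability, rater A:", round(retest_reliability(ratings, 0), 3))
print("reliability, rater B:", round(retest_reliability(ratings, 1), 3))
print("estimated true-score correlation:", round(true_score_correlation(ratings), 3))
```

With only one occasion per rater, the reliabilities in the denominator could not be estimated, so a low cross-rater correlation could not be attributed to differing true scores rather than to measurement error. This mirrors the abstract's point that a single rating per rater cannot answer the question.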
