Interpreting the Correlation Coefficient when One of the Variables is Discrete
- 1 November 1986
- journal article
- research article
- Published by SAGE Publications in Journal of Dental Research
- Vol. 65 (11) , 1346-1348
- https://doi.org/10.1177/00220345860650111301
Abstract
The effect on the correlation coefficient of discretizing data was investigated in two ways. First, the theoretical effect of dichotomizing data was calculated, and it was shown that the resulting correlation coefficient is considerably less than that between the underlying bi-variate normally distributed variables. Second, computer simulations were performed of a model in which a continuous variable (measured with some error) gives rise to a counting variable through a mechanism in which the count is zero below a certain threshold value for the continuous variable and then increases linearly as the continuous variable increases. It was shown that the correlation coefficient between the observed values of the continuous and counting variables decreased as (a) the measurement error increased, (b) the slope of the relationship decreased, and (c) the number of counts decreased. It is concluded that caution is required when interpreting correlation coefficients when one or both of the variables consist of a few (say only four or five) discrete scores.
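The two investigations described in the abstract can be explored with a short simulation. The sketch below is illustrative only and is not the authors' code: the function names, the standard-normal latent variable, the median split, the use of rounding up to obtain integer counts, and the `max_count` cap are all assumptions made for this example. For a median split of one variable, the known attenuation factor for bivariate normal data is sqrt(2/pi), roughly 0.798, which the first function can be checked against.

```python
import numpy as np


def dichotomized_r(rho, n=200_000, seed=0):
    """Pearson r between x and a median-split version of y.

    x and y are bivariate normal with correlation rho (assumed setup).
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    # Construct y so that corr(x, y) = rho.
    y = rho * x + np.sqrt(1.0 - rho**2) * rng.standard_normal(n)
    y_binary = (y > 0).astype(float)  # dichotomize at the median
    return np.corrcoef(x, y_binary)[0, 1]


def threshold_count_r(noise_sd, slope, max_count, threshold=0.0,
                      n=200_000, seed=0):
    """Pearson r between a noisily measured continuous variable and a count.

    Hypothetical model: the count is zero at or below `threshold`, then
    rises linearly with the true value (rounded up to an integer) and is
    capped at `max_count`.
    """
    rng = np.random.default_rng(seed)
    true_x = rng.standard_normal(n)
    # Measurement error is added to the observed continuous variable only.
    observed_x = true_x + noise_sd * rng.standard_normal(n)
    count = np.clip(np.ceil(slope * (true_x - threshold)), 0, max_count)
    return np.corrcoef(observed_x, count)[0, 1]


if __name__ == "__main__":
    print("dichotomized, rho=0.6:", dichotomized_r(0.6))
    for noise_sd in (0.0, 0.5, 1.0):
        print("noise_sd", noise_sd, "->",
              threshold_count_r(noise_sd, slope=2.0, max_count=4))
```

Varying `noise_sd`, `slope`, and `max_count` one at a time gives a rough way to see how each factor affects the observed correlation under these assumptions; the exact values depend on the modelling choices above and will not match the paper's simulations.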