Effect of dichotomizing a continuous variable on the model structure in multiple linear regression models
- 1 January 2000
- journal article
- research article
- Published by Taylor & Francis in Communications in Statistics - Theory and Methods
- Vol. 29 (3), 643-654
- https://doi.org/10.1080/03610920008832507
Abstract
This manuscript studies analytically the consequences of changing the scale of measurement of a continuous independent variable in a multiple linear regression setting. Assuming the continuous outcome variable, a continuous exposure variable, and a continuous control variable follow a trivariate Gaussian distribution, we examine the effect upon the structure of the model of dichotomizing the continuous control variable. It is shown that, after dichotomization, the conditional expected value of the response is a quotient of two non-linear functions and hence is no longer linear in the exposure variable. Thus, when an underlying continuous independent variable is dichotomized in multiple linear regression, and one fits a linear model using the dichotomous variable, this model's linear structure is misspecified. The estimates obtained from this model are incorrect and potentially misleading.
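The loss of linearity can be illustrated with a minimal sketch. It is not the paper's own derivation. It relies on the standard mean of a truncated normal distribution (the inverse Mills ratio) and on illustrative assumptions: the exposure X and control Z are standardized with correlation `rho`, the control is dichotomized at a hypothetical cut-point `cut`, and the coefficients `b0`, `b1`, `b2` are arbitrary. Under joint normality, E[Y | X = x, Z > cut] = b0 + b1·x + b2·E[Z | X = x, Z > cut], and the last term is a ratio of the normal density to a normal tail probability, both non-linear in x.

```python
import math


def std_normal_pdf(t):
    """Standard normal density."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(t):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def cond_mean_y(x, high, b0=0.0, b1=1.0, b2=1.0, rho=0.5, cut=0.0):
    """E[Y | X = x, Z* = high], where Z* = 1 if Z > cut else 0.

    Assumes (illustratively) that X and Z are standard normal with
    correlation rho, so that Z | X = x ~ N(rho * x, 1 - rho**2).
    """
    s = math.sqrt(1.0 - rho * rho)   # conditional SD of Z given X
    a = (cut - rho * x) / s          # standardized cut-point at this x
    if high:
        # Mean of a normal truncated from below at the cut-point.
        ez = rho * x + s * std_normal_pdf(a) / (1.0 - std_normal_cdf(a))
    else:
        # Mean of a normal truncated from above at the cut-point.
        ez = rho * x - s * std_normal_pdf(a) / std_normal_cdf(a)
    return b0 + b1 * x + b2 * ez


def second_difference(f, x, h=1.0):
    """Discrete curvature; it is zero for any function linear in x."""
    return f(x + h) - 2.0 * f(x) + f(x - h)


# Correlated exposure and control: the conditional mean is curved in x.
curvature_correlated = second_difference(lambda x: cond_mean_y(x, True), 0.0)

# Uncorrelated exposure and control: the truncation term is constant in x,
# so the conditional mean stays linear.
curvature_uncorrelated = second_difference(
    lambda x: cond_mean_y(x, True, rho=0.0), 0.0
)

print("second difference, rho = 0.5:", curvature_correlated)
print("second difference, rho = 0.0:", curvature_uncorrelated)
```

In this sketch the curvature disappears only when the exposure and the control variable are uncorrelated, which is the case in which adjusting for the control variable is not needed for confounding control in the first place.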