Interpreting the results of observational research: chance is not such a fine thing
- 17 September 1994
- Vol. 309 (6956) , 727-730
- https://doi.org/10.1136/bmj.309.6956.727
Abstract
In a randomised controlled trial, if the design is not flawed, different outcomes in the study groups must be due to the intervention itself or to chance imbalances between the groups. Because of this, tests of statistical significance are used to assess the validity of results from randomised studies. Most published papers in medical research, however, describe observational studies, which do not include a randomised intervention. This paper argues that the continuing application of tests of significance to such non-randomised investigations is inappropriate. It draws a distinction between bias and chance imbalance on the one hand (both randomised and observational studies can be affected) and confounding on the other (a problem unique to observational investigations). It concludes that neither the P value nor the 95% confidence interval should be used as evidence for the validity of an observational result.

Epidemiologists and clinical researchers design studies to estimate the effect which a presumed cause or treatment has on the occurrence of a disease. Most questions about causes of disease cannot be addressed by experiments: we must rely on the observation of life as it is, rather than on the results of controlled intervention. Such observational studies cannot provide proof of causality but are still the basis for reasoned public health decisions. In the presentation of results from observational studies, significance tests are often presented as judgments on the "truth" or validity of the effect which a presumed cause has on the occurrence of a disease. In 1965 Bradford Hill lamented this application of statistics,1 a concern given prominence again recently.2 Yet almost 30 years on, phrases such as "the result just failed to reach statistical significance" are still part of the argot of medical papers and presentations. The move towards estimating confidence intervals has not resolved this problem, as the …
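The distinction the paper draws can be made concrete with a short simulation (not from the paper; the variable names, prevalences, and effect sizes below are illustrative assumptions). A confounder C drives both exposure E and outcome Y, while E has no causal effect on Y at all. A crude comparison of exposed and unexposed groups nevertheless yields a large risk difference with a vanishingly small P value; only stratifying on C reveals that the association is spurious. No amount of significance testing on the crude comparison addresses the real problem.

```python
import math
import random

random.seed(1)
N = 20_000

def z_test(k1, n1, k0, n0):
    """Two-proportion z-test: returns (risk difference, two-sided P value)."""
    p1, p0 = k1 / n1, k0 / n0
    pooled = (k1 + k0) / (n1 + n0)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n0))
    z = (p1 - p0) / se
    # Normal-approximation P value via the error function.
    return p1 - p0, 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# Confounder C raises both the chance of exposure E and the risk of
# outcome Y; given C, the outcome is independent of the exposure.
data = []
for _ in range(N):
    c = random.random() < 0.5
    e = random.random() < (0.8 if c else 0.2)
    y = random.random() < (0.5 if c else 0.1)   # depends on C only, not E
    data.append((c, e, y))

# Crude (unadjusted) analysis: a "highly significant" association
# despite there being no causal effect of E on Y.
k1 = sum(y for c, e, y in data if e)
n1 = sum(1 for c, e, y in data if e)
k0 = sum(y for c, e, y in data if not e)
n0 = sum(1 for c, e, y in data if not e)
crude_rd, crude_p = z_test(k1, n1, k0, n0)
print(f"crude risk difference {crude_rd:.3f}, P = {crude_p:.2g}")

# Stratifying on the confounder removes the spurious association.
strata = {}
for stratum in (False, True):
    s = [(e, y) for c, e, y in data if c == stratum]
    k1 = sum(y for e, y in s if e)
    n1 = sum(1 for e, y in s if e)
    k0 = sum(y for e, y in s if not e)
    n0 = sum(1 for e, y in s if not e)
    strata[stratum] = z_test(k1, n1, k0, n0)
    rd, p = strata[stratum]
    print(f"stratum C={stratum}: risk difference {rd:.3f}, P = {p:.2g}")
```

The point of the sketch is the paper's: the tiny crude P value quantifies only the role of chance, which was never the threat here; the threat was confounding, which randomisation would have broken but observation cannot.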
This publication has 13 references indexed in Scilit:
- Bias in analytic research. Published by Elsevier, 2004
- The glitter of the t table. The Lancet, 1993
- Smoking as "independent" risk factor for suicide: illustration of an artifact from observational epidemiology? The Lancet, 1992
- Prevention of neural tube defects: results of the Medical Research Council Vitamin Study. Published by Elsevier, 1991
- Randomization, statistics, and causal inference. Epidemiology, 1990
- Beyond the confidence interval. American Journal of Public Health, 1987
- Confidence intervals rather than P values: estimation rather than hypothesis testing. BMJ, 1986
- Further experience of vitamin supplementation for prevention of neural tube defect recurrences. The Lancet, 1983
- Controlled, randomised trial of the effect of dietary fat on blood pressure. The Lancet, 1983
- A show of confidence. New England Journal of Medicine, 1978