Abstract
With the recent surge of interest in causal inference in epidemiology and statistics, the adage ‘correlation is not causation’ has been repeated so often that another salient feature of the relationship between the two seems virtually to have been forgotten: that correlation is a necessary (but not sufficient) condition for causation. [In fact, there are theoretical exceptions to this as well; for example, a causal relationship could have a symmetric U-shaped dose–response curve, in which case the Spearman (and Pearson) correlation coefficients will be 0.] It is rather easy to come up with theoretical examples (straw man arguments, in fact) where correlation is present and causation is not. For example, having ‘yellow fingers’ and getting lung cancer are correlated, but it is intuitively obvious (although this may not always have been the case) that yellow fingers do not cause lung cancer and lung cancer does not cause yellow fingers. Theoretical examples such as these, typically absurd and designed to make people feel silly for erroneously linking correlation to causation, abound. I claim, however, that although correlation does not (always) imply causation, it does so most of the time! In fact, because correlation does imply causation most of the time, the human mind has extrapolated beyond its probabilistic experience to incorrectly link the two deterministically. By doing so, one will be right most of the time, which I guess was a good enough modus operandi for our species to go forth and multiply quite successfully into the 21st century! As statisticians and/or those who rely on probability and statistics as the major tool for uncovering truth, we base decisions and recommendations on what has been observed to happen most of the time; that is exactly what we do for a living, and it works (most of the time).
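The bracketed U-shaped exception can be checked numerically. The following is a minimal sketch, not an analysis from this commentary: the dose grid, the quadratic response, and the helper names `pearson`, `average_ranks` and `spearman` are all illustrative assumptions. It uses a perfectly symmetric, noise-free curve, where response is completely determined by dose.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical data: doses 0..10 with a response that is lowest at dose 5
# and rises symmetrically on either side (a U-shaped dose-response curve).
doses = list(range(11))
responses = [(d - 5) ** 2 for d in doses]

r = pearson(doses, responses)
rho = spearman(doses, responses)
print(f"Pearson r = {r:.3f}, Spearman rho = {rho:.3f}")
```

Both coefficients vanish here only because the curve is symmetric about the middle of the dose range; an asymmetric U-shape, or a dose range that covers mostly one arm of the curve, would generally give a nonzero correlation.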
Although I myself have often taught in introductory epidemiology classes that the major limitation of cross-sectional studies is that they cannot distinguish between cause and effect, I wonder how often ‘reverse causation’ has incorrectly been deduced from a cross-sectional study. In summary, although correlation is not foolproof as a means to quantify the strength of a causal relationship, it often does a damn good job, and is certainly a well-honed point of departure for more confirmatory research. Correlation plays a critical role in scientific discovery and innovation. Spearman and Pearson, thank you very much.