Cluster analysis and disease mapping--why, when, and how? A step by step guide

Abstract
Growing public awareness of environmental hazards has led to an increased demand for public health authorities to investigate geographical clustering of diseases. Although such cluster analysis is nearly always ineffective in identifying causes of disease, it often has to be used to address public concern about environmental hazards. Interpreting the resulting data is not straightforward, however, and this paper presents a guide for the non-specialist. The pitfalls include the fact that cluster analyses are usually done post hoc, and not as a result of a prior hypothesis. This is particularly true for investigations prompted by reported clusters, which have the inherent danger of overestimating the disease rate through “boundary shrinkage” of the population from which the cases are assumed to have arisen. In disease surveillance the problem of making multiple comparisons can be overcome by testing for clustering and autocorrelation. When rates of disease are illustrated in disease maps, undue focus on areas where random fluctuation is greatest can be minimised by smoothing techniques. Despite the fact that cluster analyses rarely prove fruitful in identifying causation, they may, like single case reports, have the potential to generate new knowledge.

Public awareness about potential hazards in our environment is growing. With the advent of powerful computing techniques that can be applied to routinely collected mortality and morbidity data, the demand on public health authorities to undertake investigations into geographical patterns of disease has increased. Nevertheless, several basic epidemiological and statistical issues may present obstacles to the satisfactory handling of such data.1 Although texts are available that cover recent developments,2 3 there is no obvious resource for the generalist reader covering methods for investigating disease clusters and clustering and for interpreting disease maps.
This paper is intended to fill this gap by presenting a step by step guide to these problems for …