THE PROBLEM OF MULTIPLE INFERENCE IN STUDIES DESIGNED TO GENERATE HYPOTHESES

1 December 1985

journal article
research article
Published by Oxford University Press (OUP) in American Journal of Epidemiology

Vol. 122 (6), 1080-1095
https://doi.org/10.1093/oxfordjournals.aje.a114189

Abstract

Epidemiologic research often involves the simultaneous assessment of associations between many risk factors and several disease outcomes. In such situations, often designed to generate hypotheses, multiple univariate hypothesis-testing is not an appropriate basis for inference. The number of true positive associations in a collection of many associations can be estimated by comparing the observed distribution of p values for the positive associations to a theoretical uniform distribution, or to the observed distribution of negative associations, or to an empiric randomization distribution. None of these approaches, however, will distinguish the true from the false positive associations. Various criteria for selecting a subset of associations to report are considered by the authors, including Bonferroni adjustment of p values, splitting the sample for searching and testing, Bayesian inference, and decision theory. The authors prefer an approach in which all associations in the data are reported, whether significant or not, followed by a ranking in order of priority for investigation using empirical Bayes techniques. Methods are illustrated by application to preliminary data from a study aimed at identifying hitherto unsuspected occupational carcinogens.
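
Two of the ideas named in the abstract can be illustrated with a minimal sketch: Bonferroni adjustment of p values, and a crude count of how many small p values exceed what a uniform distribution would produce. The function names, the 0.05 threshold, and the p values are hypothetical and chosen only for illustration. The excess count is a simplification and is not necessarily the estimator the authors use.

```python
def bonferroni(p_values):
    """Bonferroni adjustment: multiply each p value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def excess_small_p(p_values, alpha=0.05):
    """Observed count of p values <= alpha minus the count expected if all
    p values were uniform on [0, 1] (that is, if every association were null).
    A rough assumption-laden indicator, not a formal estimator."""
    m = len(p_values)
    observed = sum(1 for p in p_values if p <= alpha)
    expected = alpha * m
    return observed - expected


# Hypothetical p values for ten exposure-disease associations (illustrative only).
p_values = [0.001, 0.004, 0.03, 0.20, 0.35, 0.50, 0.62, 0.75, 0.88, 0.95]

adjusted = bonferroni(p_values)
excess = excess_small_p(p_values, alpha=0.05)

print("Bonferroni-adjusted:", [round(p, 3) for p in adjusted])
print("Excess of small p values over uniform expectation:", round(excess, 2))
```

As the abstract notes, a positive excess suggests that some associations are real but does not say which ones. This is part of why the authors favor reporting all associations and ranking them rather than selecting by a significance cutoff.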
