Shrinkage observed-to-expected ratios for robust and transparent large-scale pattern discovery
Open Access
- 24 June 2011
- journal article
- Published by SAGE Publications in Statistical Methods in Medical Research
- Vol. 22 (1), 57-69
- https://doi.org/10.1177/0962280211403604
Abstract
Large observational data sets are a great asset to better understand the effects of medicines in clinical practice and, ultimately, improve patient care. For an empirical pattern in observational data to be of practical relevance, it should represent a substantial deviation from the null model. For the purpose of identifying such deviations, statistical significance tests are inadequate, as they do not on their own distinguish the magnitude of an effect from its data support. The observed-to-expected (OE) ratio, on the other hand, directly measures strength of association and is an intuitive basis to identify a range of patterns related to event rates, including pairwise associations, higher order interactions and temporal associations between events. It is sensitive to random fluctuations for rare events with low expected counts, but statistical shrinkage can protect against spurious associations. Shrinkage OE ratios provide a simple but powerful framework for large-scale pattern discovery. In this article, we outline a range of patterns that are naturally viewed in terms of OE ratios and propose a straightforward and effective statistical shrinkage transformation that can be applied to any such ratio. The proposed approach retains emphasis on the practical relevance and transparency of highlighted patterns, while protecting against spurious associations.
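The abstract does not spell out the shrinkage transformation, so the following is only an illustrative sketch of the general idea. It assumes an additive form in which a constant `alpha` is added to both the observed and the expected count before taking the log ratio; the function names and the default `alpha = 0.5` are assumptions made for this example, not details taken from the abstract.

```python
import math


def oe_ratio(observed, expected):
    """Raw observed-to-expected ratio (undefined when expected == 0)."""
    return observed / expected


def shrunk_log_oe(observed, expected, alpha=0.5):
    """Log2 OE ratio with additive shrinkage.

    ASSUMPTION: adding the same constant `alpha` to numerator and
    denominator is one simple way to shrink the ratio; the value 0.5
    is chosen here for illustration only.
    """
    return math.log2((observed + alpha) / (expected + alpha))


# Rare event: a single observation against a tiny expected count.
rare_raw = math.log2(oe_ratio(1, 0.01))
rare_shrunk = shrunk_log_oe(1, 0.01)

# Well-supported pattern with the same raw ratio of 100.
common_raw = math.log2(oe_ratio(1000, 10))
common_shrunk = shrunk_log_oe(1000, 10)

print(f"rare:   raw={rare_raw:.2f}  shrunk={rare_shrunk:.2f}")
print(f"common: raw={common_raw:.2f}  shrunk={common_shrunk:.2f}")
```

Both cases have the same raw ratio, but under this sketch the rare-event value is pulled strongly toward zero while the well-supported value barely moves, which is the behaviour the abstract attributes to shrinkage: magnitude is kept where the data support it and damped where they do not.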