Robustness of Prevalence Estimates Derived from Misclassified Data from Administrative Databases
- 1 March 2007
- journal article
- Published by Oxford University Press (OUP) in Biometrics
- Vol. 63 (1), 272-279
- https://doi.org/10.1111/j.1541-0420.2006.00665.x
Abstract
Because primary data collection can be expensive, researchers are increasingly using information collected in medical administrative databases for scientific purposes. This information, however, is typically collected for reasons other than research, and many such databases have been shown to contain substantial proportions of misclassification errors. For example, many administrative databases contain fields for patient diagnostic codes, but these are often missing or inaccurate, in part because physician reimbursement schemes depend on medical acts performed rather than any diagnosis. Errors in ascertaining which individuals have a given disease bias not only prevalence estimates, but also estimates of associations between the disease and other variables, such as medication use. We attempt to estimate the prevalence of osteoarthritis (OA) among elderly Quebeckers using a government administrative database. We compare a naive estimate relying solely on the physician diagnoses of OA listed in the database to estimates from several different Bayesian latent class models which adjust for misclassified physician diagnostic codes via use of other available diagnostic clues. We find that the prevalence estimates vary widely, depending on the model used and assumptions made. We conclude that any inferences from these databases need to be interpreted with great caution, until further work estimating the reliability of database items is carried out.
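To illustrate why misclassified diagnostic codes bias a naive prevalence estimate, the following is a minimal sketch using the standard sensitivity/specificity relationship and its algebraic inversion (often called the Rogan-Gladen correction). It is not the Bayesian latent class approach described in the abstract; the function names and all numeric values are hypothetical and chosen only for illustration.

```python
def apparent_prevalence(true_prev, sensitivity, specificity):
    """Expected proportion flagged as diseased by an imperfect indicator.

    True cases are flagged with probability `sensitivity`; non-cases are
    wrongly flagged with probability `1 - specificity`.
    """
    return true_prev * sensitivity + (1.0 - true_prev) * (1.0 - specificity)


def corrected_prevalence(apparent_prev, sensitivity, specificity):
    """Invert the relationship above, clipping the result to [0, 1]."""
    denom = sensitivity + specificity - 1.0
    if denom <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    estimate = (apparent_prev + specificity - 1.0) / denom
    return min(1.0, max(0.0, estimate))


# Hypothetical values, NOT taken from the paper.
naive = apparent_prevalence(true_prev=0.30, sensitivity=0.50, specificity=0.95)

# The corrected estimate depends strongly on the assumed sensitivity.
for assumed_se in (0.40, 0.50, 0.70):
    adjusted = corrected_prevalence(naive, assumed_se, specificity=0.95)
    print(f"assumed sensitivity {assumed_se:.2f}: adjusted prevalence {adjusted:.3f}")
```

With these made-up inputs the naive proportion falls well below the true prevalence, and the adjusted value shifts noticeably as the assumed sensitivity changes, which mirrors the abstract's point that estimates depend heavily on the model and assumptions used.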