Disambiguation Data: Extracting Information from Anonymized Sources
Open Access
- 1 November 2002
- journal article
- Published by Oxford University Press (OUP) in Journal of the American Medical Informatics Association
- Vol. 9 (90061), 110S-114
- https://doi.org/10.1197/jamia.M1240
Abstract
Privacy protection is an important consideration when releasing medical databases to the research community. We show that while recent advances in anonymization algorithms provide increased levels of protection, it is still possible to calculate approximations to the original data set. In some cases, one can even uniquely reconstruct entries in a table before anonymization. In this paper, we demonstrate how knowledge of an anonymization algorithm based on ambiguating data cell entries can be used to undo the anonymization process. We investigate the effect of this algorithm and its reversal on data sets of varying sizes and distributions. It is shown that by using a computationally complex disambiguation process, information on individuals can be extracted from an anonymized data set.
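The abstract does not describe the disambiguation procedure itself, so the following is only a toy illustration of the general idea, not the authors' algorithm. It assumes (hypothetically) that each anonymized cell is released as a set of candidate values and that an attacker also knows how often each value occurred in the original column. Under those assumptions, a brute-force search over all combinations shows how a cell can sometimes be narrowed to a single value, and why the search cost grows quickly with table size.

```python
from collections import Counter
from itertools import product


def consistent_columns(ambiguated, known_counts):
    """Yield every concrete column that fits the candidate sets and the known counts.

    ambiguated   -- list of sets; each set holds the candidate values of one cell
                    (an assumed release format, not taken from the paper)
    known_counts -- Counter of value frequencies in the original column
                    (assumed attacker background knowledge)
    """
    # The number of combinations is the product of the candidate-set sizes,
    # so this enumeration is exponential in the number of cells.
    for combo in product(*ambiguated):
        if Counter(combo) == known_counts:
            yield combo


def disambiguate(ambiguated, known_counts):
    """Return, for each cell, the values that occur in at least one consistent column."""
    survivors = [set() for _ in ambiguated]
    for combo in consistent_columns(ambiguated, known_counts):
        for cell_values, value in zip(survivors, combo):
            cell_values.add(value)
    return survivors


# Hypothetical three-row diagnosis column after ambiguation.
ambiguated = [
    {"flu", "asthma"},
    {"flu", "diabetes"},
    {"asthma", "diabetes"},
]
known_counts = Counter({"flu": 2, "asthma": 1})

result = disambiguate(ambiguated, known_counts)
```

In this toy case every cell collapses to one surviving value, which corresponds to the unique reconstruction the abstract mentions. With weaker background knowledge, some cells would keep several candidates and only an approximation of the original column would be recovered.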