An empirical comparison of record linkage procedures
- 26 April 2002
- journal article
- research article
- Published by Wiley in Statistics in Medicine
- Vol. 21 (10) , 1485-1496
- https://doi.org/10.1002/sim.1147
Abstract
We consider the problem of record linkage in the situation where we have only non-unique identifiers, like names, sex, race etc., as common identifiers in databases to be linked. For such situations much work on probabilistic methods of record linkage can be found in the statistical literature. However, although many groups undoubtedly still use deterministic procedures, not much literature is available on deterministic strategies. Furthermore, there appears to exist almost no documentation on the comparison of results for the two strategies. In this work we compare a stepwise deterministic linkage strategy with a probabilistic strategy, as implemented in AUTOMATCH, for a situation in which the truth is known. The comparison was carried out on a linkage between medical records from the Regional Perinatal Intensive Care Centers database and educational records from the Florida Department of Education. Social security numbers, available in both databases, were used to decide the true status of each record pair after matching. Match rates and error rates for the two strategies are compared and a discussion of their similarities and differences, strengths and weaknesses is presented. Copyright © 2002 John Wiley & Sons, Ltd.
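The contrast between the two strategies named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's procedure and not AUTOMATCH's implementation: the field names, the two deterministic passes, the m/u probabilities, and the sample records are all hypothetical, chosen only to show how a stepwise deterministic rule and a Fellegi-Sunter style agreement weight treat the same record pair, with the social security number held back as the truth standard.

```python
from math import log2

# Hypothetical per-field probabilities (illustrative values, not from the paper):
# m = P(field agrees | records are a true match)
# u = P(field agrees | records are not a match)
M_U = {
    "last_name": (0.95, 0.01),
    "first_name": (0.90, 0.02),
    "birth_date": (0.97, 0.003),
    "sex": (0.99, 0.5),
}

# Hypothetical stepwise deterministic passes, strictest first.
STEPS = [
    ("last_name", "first_name", "birth_date", "sex"),
    ("last_name", "birth_date", "sex"),
]


def deterministic_step(a, b):
    """Return the 1-based index of the first pass whose fields all agree, else None."""
    for i, fields in enumerate(STEPS, start=1):
        if all(a[f] == b[f] for f in fields):
            return i
    return None


def match_weight(a, b):
    """Sum log2 likelihood ratios over fields (Fellegi-Sunter style composite weight)."""
    total = 0.0
    for field, (m, u) in M_U.items():
        if a[field] == b[field]:
            total += log2(m / u)              # agreement raises the weight
        else:
            total += log2((1 - m) / (1 - u))  # disagreement lowers it
    return total


def is_true_match(a, b):
    """Truth standard: the pair is a true match when the SSNs are equal."""
    return a["ssn"] == b["ssn"]


# Made-up records: `edu` is the same person as `med` with a first-name variant.
med = {"last_name": "SMITH", "first_name": "JON", "birth_date": "1990-01-02",
       "sex": "M", "ssn": "111"}
edu = {"last_name": "SMITH", "first_name": "JOHN", "birth_date": "1990-01-02",
       "sex": "M", "ssn": "111"}
other = {"last_name": "JONES", "first_name": "MARY", "birth_date": "1991-05-06",
         "sex": "F", "ssn": "222"}

for label, rec in (("edu", edu), ("other", other)):
    print(label, deterministic_step(med, rec), round(match_weight(med, rec), 2),
          is_true_match(med, rec))
```

In this sketch the first-name variant fails the strictest deterministic pass and is caught only by the looser second pass, while the probabilistic weight absorbs the single disagreement and stays positive. In practice the composite weight would be compared against chosen cutoffs, and match and error rates would be tallied against the truth standard over all candidate pairs.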