Probabilistic Record Linkage: Relationships between File Sizes, Identifiers, and Match Weights
- 1 January 2001
- journal article
- research article
- Published by Georg Thieme Verlag KG in Methods of Information in Medicine
- Vol. 40 (03) , 196-203
- https://doi.org/10.1055/s-0038-1634155
Abstract
This study investigates relationships between file sizes, amounts of information contained in commonly used record linkage variables, and the amount of information needed for a successful probabilistic linkage project. We present an equation predicting the amount of information needed for a successful linkage project. Match weights for variables commonly used in record linkage are measured using artificially created databases. Linkage algorithms were successful when the sum of minimum weights for variables used in a linkage exceeded the predicted cutoff. Linkage results were acceptable when this sum was near the predicted cutoff. This technique enables researchers to determine if enough information exists to perform a successful probabilistic linkage.
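The abstract does not reproduce the article's equation, so the following is only a rough sketch of the general idea under the standard Fellegi-Sunter framing, not the authors' formula. It assumes that an agreement weight is `log2(m / u)` and that the weight needed to reach a target match probability follows from the prior odds that a random record pair is a true match. The function names, the example m- and u-probabilities, and the file sizes are all hypothetical.

```python
import math


def agreement_weight(m: float, u: float) -> float:
    """Fellegi-Sunter agreement weight in bits.

    m: probability that the field agrees on a true match.
    u: probability that the field agrees on a non-match.
    """
    return math.log2(m / u)


def required_weight(size_a: int, size_b: int,
                    expected_matches: int, target_prob: float) -> float:
    """Total weight needed for a pair to reach target_prob of being a match.

    Assumption (not taken from the article): posterior odds equal
    prior odds times the likelihood ratio, where the prior odds are
    expected_matches / (size_a * size_b - expected_matches).
    """
    prior_odds = expected_matches / (size_a * size_b - expected_matches)
    target_odds = target_prob / (1.0 - target_prob)
    return math.log2(target_odds) - math.log2(prior_odds)


def enough_information(weights: list[float], cutoff: float) -> bool:
    """True when the summed per-variable weights reach the cutoff."""
    return sum(weights) >= cutoff


if __name__ == "__main__":
    # Hypothetical (m, u) pairs for three linkage variables.
    fields = {
        "sex": (0.98, 0.50),
        "birth_date": (0.95, 0.0003),
        "zip_code": (0.90, 0.01),
    }
    weights = [agreement_weight(m, u) for m, u in fields.values()]
    cutoff = required_weight(100_000, 50_000, 20_000, 0.95)
    print(f"sum of weights: {sum(weights):.2f} bits")
    print(f"required weight: {cutoff:.2f} bits")
    print("enough information:", enough_information(weights, cutoff))
```

Under these assumptions the required weight grows with the product of the two file sizes, which is consistent with the abstract's point that larger files need more identifying information before a linkage can succeed.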