Geocoding Addresses from a Large Population-based Study: Lessons Learned

Abstract
Background: Geographic information systems (GIS) and spatial statistics are useful for exploring the relation between geographic location and health. The ultimate usefulness of GIS depends on both the completeness and the accuracy of geocoding (also known as address matching), the process of assigning latitude/longitude coordinates to study participants’ residences so that the coordinates closely approximate the true locations. The goal of this project was to develop an iterative geocoding process that would achieve a high match rate in a large population-based health study.

Methods: Data were from a study conducted in Wisconsin using mailing addresses of participants who were interviewed by telephone from 1988 to 1995. We standardized the addresses according to US Postal Service guidelines, used desktop GIS geocoding software and two versions of the Topologically Integrated Geographic Encoding and Referencing (TIGER) street maps, accessed Internet mapping engines for problematic addresses, and recontacted a small number of study participants’ households. We also tabulated the cost, time commitment, and software requirements of the project, with brief notes on each step and its alternatives.

Results: Of the 14,804 participants, 97% were ultimately assigned latitude/longitude coordinates corresponding to their respective residences. The remaining 3% were geocoded to their ZIP code centroid.

Conclusion: The multiple methods described in this work provide practical information for investigators who are considering the use of GIS in their population health research.
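
The iterative process summarized in the abstract can be read as a cascade: standardize the address, try each geocoding source in turn, and fall back to the ZIP code centroid only when every source fails. The Python sketch below illustrates that control flow only and is not the study's actual software. The names `standardize`, `table_geocoder`, and `geocode_cascade`, the three-entry suffix table, the step names, and all coordinates are hypothetical, and the lookup tables are stand-ins for real street-map matching.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Sequence, Tuple

Coord = Tuple[float, float]                  # (latitude, longitude)
Geocoder = Callable[[str], Optional[Coord]]  # returns None when there is no match

# Tiny illustrative subset of street-suffix abbreviations; a real
# standardizer would cover far more cases than this.
SUFFIXES = {"STREET": "ST", "AVENUE": "AVE", "ROAD": "RD"}


def standardize(address: str) -> str:
    """Uppercase, drop periods and commas, collapse spaces, abbreviate suffixes."""
    tokens = address.upper().replace(".", "").replace(",", " ").split()
    return " ".join(SUFFIXES.get(token, token) for token in tokens)


@dataclass
class Match:
    address: str  # standardized address that was looked up
    coord: Coord  # assigned coordinates
    source: str   # name of the step that produced the coordinates


def table_geocoder(table: Dict[str, Coord]) -> Geocoder:
    """Wrap a lookup table as a geocoder (stand-in for a street-map match)."""
    return table.get


def geocode_cascade(
    address: str,
    steps: Sequence[Tuple[str, Geocoder]],
    zip_centroid: Coord,
) -> Match:
    """Try each geocoder in order; fall back to the ZIP code centroid."""
    clean = standardize(address)
    for name, geocoder in steps:
        coord = geocoder(clean)
        if coord is not None:
            return Match(clean, coord, name)
    return Match(clean, zip_centroid, "zip_centroid")


# Made-up lookup tables standing in for two versions of a street map.
street_map_v1 = {"123 MAIN ST": (43.07, -89.40)}
street_map_v2 = {"45 OAK AVE": (44.50, -88.00)}
steps = [
    ("street_map_v1", table_geocoder(street_map_v1)),
    ("street_map_v2", table_geocoder(street_map_v2)),
]

if __name__ == "__main__":
    for raw in ["123  main street", "45 Oak Avenue", "9 Unknown Rd."]:
        print(geocode_cascade(raw, steps, zip_centroid=(43.00, -89.50)))
```

Recording the `source` of each match keeps the precision of every coordinate visible, so residence-level matches can later be separated from centroid fallbacks when computing a match rate.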