A method to map heterogeneity between near but non-equivalent semantic attributes in multiple health data registries

Abstract
Health registries from multiple jurisdictions often include terms that are assumed to be semantically equivalent (e.g., fetal death and stillbirth). Closer examination reveals that such attributes have near, but non-equivalent, semantics. Their degree of semantic heterogeneity is therefore an important indicator of the uncertainty associated with data integration between registries. We build an OWL-encoded ontology that formalizes the relationships between similar perinatal concepts found in different databases. We also introduce the concept of ontology-based metadata as a means of contextualizing such terms and linking that context to the attribute data. These extended metadata are exported as XML from the health registries and, along with the OWL ontology, are accessed through a web-based GUI available to health researchers. The mapping presented in the GUI serves as the basis for making ad hoc comparison and integration decisions. Uncertainty is addressed by precisely mapping the semantic heterogeneity between fields.
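As an informal illustration of what mapping near but non-equivalent attributes might involve, the following minimal Python sketch classifies the relationship between two registry attributes. It is not the OWL ontology described above. The `Attribute` class, the `relation` function, the registry names, and the gestational-age thresholds are all hypothetical placeholders chosen only to show how two terms can overlap without being equivalent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    """A registry attribute with a hypothetical inclusion threshold."""
    registry: str
    name: str
    min_weeks: int  # assumed minimum gestational age for inclusion (illustrative)


def relation(a: Attribute, b: Attribute) -> str:
    """Classify how the cases covered by `a` relate to those covered by `b`."""
    if a.min_weeks == b.min_weeks:
        return "equivalent"
    # A lower threshold admits more cases, so that attribute is the broader one.
    return "broader" if a.min_weeks < b.min_weeks else "narrower"


# Hypothetical attributes from two registries; the thresholds are placeholders.
fetal_death = Attribute("registry_A", "fetal_death", 20)
stillbirth = Attribute("registry_B", "stillbirth", 28)

print(relation(fetal_death, stillbirth))
```

Under these assumed thresholds, the first attribute covers every case covered by the second plus additional ones, so integrating the two fields as if they were equivalent would carry uncertainty that an explicit mapping makes visible.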