Biomedical data integration: using XML to link clinical and research data sets

Abstract
Data integration occurs when a query proceeds through multiple data sets, thereby relating diverse data extracted from different data sources. Data integration is particularly important to biomedical researchers because data obtained from experiments on human tissue specimens have little applied value unless they can be combined with medical data (i.e., pathologic and clinical information). In the past, research data were correlated with medical data by manually retrieving, reading, assembling and abstracting patient charts, pathology reports, radiology reports and the results of special tests and procedures. Such manual annotation is impractical when experiments involve hundreds or thousands of tissue specimens and produce large, complex data collections. The purpose of this paper is to review how XML (eXtensible Markup Language) provides the fundamental tools that support biomedical data integration. The article also discusses some of the most important challenges that block the widespread availability of annotated biomedical data sets.
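
As an illustration of a query that proceeds through two data sets, the following minimal sketch joins a research data set to a clinical data set on a shared specimen identifier. It is not taken from the paper: the element names (`experiment`, `specimen`, `marker_level`, `pathology_records`, `record`, `diagnosis`), the identifiers and all values are hypothetical, chosen only to show the idea with Python's standard `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET

# Hypothetical research data set: one measurement per tissue specimen.
RESEARCH_XML = """
<experiment>
  <specimen id="S001"><marker_level>2.4</marker_level></specimen>
  <specimen id="S002"><marker_level>0.7</marker_level></specimen>
  <specimen id="S003"><marker_level>1.9</marker_level></specimen>
</experiment>
"""

# Hypothetical clinical data set: pathology diagnoses keyed by specimen.
CLINICAL_XML = """
<pathology_records>
  <record specimen_id="S001"><diagnosis>adenocarcinoma</diagnosis></record>
  <record specimen_id="S002"><diagnosis>normal</diagnosis></record>
</pathology_records>
"""


def integrate(research_xml, clinical_xml):
    """Return one dict per research specimen, annotated with its diagnosis."""
    # Index the clinical records by the shared specimen identifier.
    clinical_root = ET.fromstring(clinical_xml)
    diagnosis_by_id = {
        rec.get("specimen_id"): rec.findtext("diagnosis")
        for rec in clinical_root.iter("record")
    }

    # Walk the research data and look up each specimen in the clinical index.
    rows = []
    for spec in ET.fromstring(research_xml).iter("specimen"):
        sid = spec.get("id")
        rows.append({
            "specimen_id": sid,
            "marker_level": float(spec.findtext("marker_level")),
            "diagnosis": diagnosis_by_id.get(sid),  # None if no clinical record
        })
    return rows


if __name__ == "__main__":
    for row in integrate(RESEARCH_XML, CLINICAL_XML):
        print(row)
```

The join depends entirely on both sources using the same specimen identifier. A specimen with no matching clinical record is kept with a diagnosis of `None` rather than dropped, so gaps in annotation remain visible.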
