Biomedical data integration: using XML to link clinical and research data sets
- 1 May 2005
- journal article
- review article
- Published by Taylor & Francis in Expert Review of Molecular Diagnostics
- Vol. 5 (3) , 329-336
- https://doi.org/10.1586/14737159.5.3.329
Abstract
Data integration occurs when a query proceeds through multiple data sets, thereby relating diverse data extracted from different data sources. Data integration is particularly important to biomedical researchers since data obtained from experiments on human tissue specimens have little applied value unless they can be combined with medical data (i.e., pathologic and clinical information). In the past, research data were correlated with medical data by manually retrieving, reading, assembling and abstracting patient charts, pathology reports, radiology reports and the results of special tests and procedures. Manual annotation of research data is impractical when experiments involve hundreds or thousands of tissue specimens, resulting in large, complex data collections. The purpose of this paper is to review how XML (eXtensible Markup Language) provides the fundamental tools that support biomedical data integration. The article also discusses some of the most important challenges that block the widespread availability of annotated biomedical data sets.
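As a rough illustration of the definition above (a query that proceeds through more than one data set), the following minimal Python sketch links a research data set to a clinical data set through a shared specimen identifier. It is not taken from the article: the element names (`research`, `clinical`, `specimen`, `expression`, `diagnosis`), the `id` attribute, the sample values and the `integrate` function are all hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical research data set: one measurement per tissue specimen.
RESEARCH_XML = """
<research>
  <specimen id="S001"><expression gene="GENE_A">2.4</expression></specimen>
  <specimen id="S002"><expression gene="GENE_A">0.7</expression></specimen>
</research>
"""

# Hypothetical clinical data set keyed by the same specimen identifier.
CLINICAL_XML = """
<clinical>
  <specimen id="S001"><diagnosis>adenocarcinoma</diagnosis></specimen>
  <specimen id="S003"><diagnosis>benign</diagnosis></specimen>
</clinical>
"""


def integrate(research_xml, clinical_xml):
    """Return one record per specimen that appears in both data sets."""
    research = ET.fromstring(research_xml)
    clinical = ET.fromstring(clinical_xml)

    # Index the clinical records by specimen id for direct lookup.
    diagnoses = {
        s.get("id"): s.findtext("diagnosis")
        for s in clinical.findall("specimen")
    }

    linked = []
    for s in research.findall("specimen"):
        sid = s.get("id")
        if sid in diagnoses:  # keep only specimens with clinical annotation
            expr = s.find("expression")
            linked.append({
                "id": sid,
                "gene": expr.get("gene"),
                "level": float(expr.text),
                "diagnosis": diagnoses[sid],
            })
    return linked


LINKED = integrate(RESEARCH_XML, CLINICAL_XML)
```

In this sketch only specimen `S001` appears in both documents, so it is the only record returned. Specimens lacking a counterpart in the other data set are silently dropped, which is a design choice of the example rather than anything the article prescribes.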