NLP-based identification of pneumonia cases from free-text radiological reports.
- 6 November 2008
- journal article
- Vol. 2008, 172-6
Abstract
Radiological reports are a rich source of clinical data that can be mined to assist with biosurveillance of emerging infectious diseases. Beyond biosurveillance, they are also an important source of clinical data for health services research. Pneumonias and other radiological findings on chest X-ray or chest computed tomography (CT) are one type of finding relevant to both biosurveillance and health services research. In this study we examined the ability of a natural language processing (NLP) system to accurately identify pneumonias and other lesions within free-text radiological reports. The system encoded the reports in the SNOMED CT ontology, and a set of SNOMED CT-based rules was then created in our Health Archetype Language to identify these radiological findings and diagnoses. The encoded rules were executed against the SNOMED CT encodings of the radiological reports. The system's output was compared with a clinician review of the radiological reports. The accuracy of the system in identifying pneumonias was high, with a sensitivity (recall) of 100%, a specificity of 98%, and a positive predictive value (precision) of 97%. We conclude that SNOMED CT-based computable rules are accurate enough for the automated biosurveillance of pneumonias from radiological reports.
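The evaluation described in the abstract (rules run over SNOMED CT encodings, then compared with a clinician review) can be illustrated with a minimal sketch. This is not the authors' Health Archetype Language or their NLP system: the `EncodedFinding` type, the `rule_flags_pneumonia` function, the toy reports, and the clinician labels are all hypothetical, and the concept identifier is an illustrative placeholder that should be checked against a current SNOMED CT release.

```python
from dataclasses import dataclass

# Hypothetical target set of SNOMED CT concept identifiers treated as "pneumonia".
# A real rule would likely also cover descendant concepts in the hierarchy.
PNEUMONIA_CONCEPTS = frozenset({"233604007"})


@dataclass(frozen=True)
class EncodedFinding:
    concept_id: str  # concept identifier assigned by the (assumed) encoder
    negated: bool    # True when the report denies the finding, e.g. "no pneumonia"


def rule_flags_pneumonia(findings):
    """Return True if any non-negated finding is in the target concept set."""
    return any(
        f.concept_id in PNEUMONIA_CONCEPTS and not f.negated for f in findings
    )


def evaluate(predicted, reference):
    """Compute sensitivity, specificity and PPV from paired boolean labels."""
    pairs = list(zip(predicted, reference))
    tp = sum(p and r for p, r in pairs)          # flagged, clinician agrees
    fp = sum(p and not r for p, r in pairs)      # flagged, clinician disagrees
    tn = sum(not p and not r for p, r in pairs)  # not flagged, clinician agrees
    fn = sum(not p and r for p, r in pairs)      # missed case

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
    }


# Toy data (invented for illustration, not from the study).
reports = [
    [EncodedFinding("233604007", negated=False)],  # pneumonia asserted
    [EncodedFinding("233604007", negated=True)],   # pneumonia explicitly denied
    [],                                            # no relevant findings encoded
    [EncodedFinding("233604007", negated=False)],  # asserted; reviewer disagrees
]
clinician_labels = [True, False, False, False]

predicted = [rule_flags_pneumonia(r) for r in reports]
metrics = evaluate(predicted, clinician_labels)
```

Treating the clinician review as the reference standard, sensitivity is the share of clinician-confirmed pneumonias that the rule flags, specificity is the share of non-pneumonia reports it leaves unflagged, and positive predictive value is the share of flagged reports that the clinician confirms.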