Three approaches to automatic assignment of ICD-9-CM codes to radiology reports.
- 11 October 2007
- journal article
- research article
- Vol. 2007, 279-83
Abstract
We describe and evaluate three systems for automatically predicting the ICD-9-CM codes of radiology reports from short excerpts of text. The first system builds on an open source search engine, Lucene, and exploits the relevance of reports to one another based on individual words. The second uses BoosTexter, a boosting algorithm based on n-grams (sequences of consecutive words) and s-grams (sequences of non-consecutive words) extracted from the reports. The third employs a set of hand-crafted rules that capture lexical elements (short, meaningful strings of words) derived from BoosTexter's n-grams, and that are enhanced by shallow semantic information in the form of negation, synonymy, and uncertainty. Our evaluation shows that semantic information significantly contributes to ICD-9-CM coding with lexical elements. Also, a simple hand-crafted rule-based system with lexical elements and semantic information can outperform algorithmically more complex systems, such as Lucene and BoosTexter, when these systems base their ICD-9-CM predictions only upon individual words, n-grams, or s-grams.
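To make the abstract's terminology concrete, the following Python sketch illustrates n-grams, s-grams, and a rule that combines a lexical element with negation and uncertainty cues. It is a minimal illustration under stated assumptions, not the systems evaluated in the article: the cue lists, the rule table, the context window, the restriction of s-grams to word pairs, and all function names are hypothetical. Treating uncertain mentions the same way as negated ones is likewise an assumption made here for brevity.

```python
import re

# Hypothetical cue lists and rules, chosen only for illustration.
NEGATION_CUES = {"no", "without", "negative"}
UNCERTAINTY_CUES = {"possible", "probable", "suspected"}
# ICD-9-CM code -> synonymous trigger words (486: pneumonia, 786.2: cough).
RULES = {"486": {"pneumonia"}, "786.2": {"cough", "coughing"}}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ngrams(tokens, n):
    """Return all sequences of n consecutive tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def sgrams(tokens, max_gap=2):
    """Return ordered word pairs that skip between 1 and max_gap tokens.

    Assumption: s-grams are limited to pairs here to keep the sketch short.
    """
    pairs = []
    for i in range(len(tokens)):
        for j in range(i + 2, min(i + 2 + max_gap, len(tokens))):
            pairs.append((tokens[i], tokens[j]))
    return pairs


def assign_codes(text, window=3):
    """Assign a code when a trigger word appears with no negation or
    uncertainty cue among the `window` tokens that precede it.

    Sentence boundaries are ignored in this sketch.
    """
    tokens = tokenize(text)
    codes = set()
    for i, tok in enumerate(tokens):
        for code, terms in RULES.items():
            if tok not in terms:
                continue
            context = set(tokens[max(0, i - window):i])
            if context & NEGATION_CUES or context & UNCERTAINTY_CUES:
                continue  # negated or uncertain mention: do not code it
            codes.add(code)
    return sorted(codes)


if __name__ == "__main__":
    toks = tokenize("No evidence of pneumonia")
    print(ngrams(toks, 2))
    print(sgrams(toks))
    print(assign_codes("Cough. No evidence of pneumonia."))
```

With these hypothetical rules, the excerpt "Cough. No evidence of pneumonia." receives only the cough code, because the negation cue "no" falls inside the context window before "pneumonia". A purely word-based matcher would assign both codes, which is the kind of error that shallow semantic information is meant to prevent.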