Knowledge discovery and data mining to assist natural language understanding.
- 1 January 1998
- journal article
- p. 835-9
Abstract
As natural language processing systems become more frequent in clinical use, methods for interpreting the output of these programs become increasingly important. These methods require the effort of a domain expert, who must build specific queries and rules for interpreting the processor output. Knowledge discovery and data mining tools can be used instead of a domain expert to automatically generate these queries and rules. C5.0, a decision tree generator, was used to create a rule base for a natural language understanding system. A general-purpose natural language processor using this rule base was tested on a set of 200 chest radiograph reports. When a small set of reports, classified by physicians, was used as the training set, the generated rule base performed as well as lay persons, but worse than physicians. When a larger set of reports, using ICD9 coding to classify the set, was used for training the system, the rule base performed worse than the physicians and lay persons. It appears that a larger, more accurate training set is needed to increase performance of the method.
This publication has 13 references indexed in Scilit:
- Development and Evaluation of a Computerized Admission Diagnoses Encoding System. Computers and Biomedical Research, 1996
- Natural Language Processing in Medicine: An Overview. Methods of Information in Medicine, 1996
- Access to Data: Comparing AccessMed With Query by Review. Journal of the American Medical Informatics Association, 1996
- Artificial Intelligence in Pediatrics: Important Clinical Signs in Newborn Syndromes. Computers and Biomedical Research, 1996
- Evaluation of automatic knowledge acquisition techniques in the diagnosis of acute abdominal pain. Artificial Intelligence in Medicine, 1996
- Unlocking Clinical Data from Narrative Reports: A Study of Natural Language Processing. Annals of Internal Medicine, 1995
- A General Natural-language Text Processor for Clinical Radiology. Journal of the American Medical Informatics Association, 1994
- Discordance of Databases Designed for Claims Payment versus Clinical Information Systems: Implications for Outcomes Research. Annals of Internal Medicine, 1993
- A Comparison of Logistic Regression to Decision-Tree Induction in a Medical Domain. Computers and Biomedical Research, 1993