Auditing the Unified Medical Language System with Semantic Methods
Open Access
- 1 January 1998
- journal article
- Published by Oxford University Press (OUP) in Journal of the American Medical Informatics Association
- Vol. 5 (1), 41-51
- https://doi.org/10.1136/jamia.1998.0050041
Abstract
Objective: The National Library of Medicine's (NLM) Unified Medical Language System (UMLS) includes a Metathesaurus (Meta), which is a compilation of medical terms drawn from over 30 controlled vocabularies, and a Semantic Net, which contains the semantic types used to categorize Meta concepts and the semantic relations to connect them. Meta has been constructed through lexical matching techniques and human review. The purpose of this study was to audit the Meta using semantic techniques to identify possible inconsistencies.
Methods: Five different techniques were applied: (1) detection of ambiguity in Meta concepts with two or more semantic types, (2) detection of interchangeable keyword synonyms, (3) detection of redundant pairs of Meta concepts (using lexical matching combined with keyword synonyms), (4) detection of inconsistent parent-child relationships in Meta (based on the semantic type information), and (5) discovery of pairs of semantic types for which relations could be added to the Semantic Net, based on “other” relationships between Meta concepts.
Results: Of 57,592 concepts with multiple semantic types, 1817 (3.2%) were judged to be ambiguous. Keyword analysis showed 7121 pairs of interchangeable words. Using the keyword pairs, 5031 pairs of potentially redundant concepts were suggested, of which 3274 (65.1%) were judged to actually be redundant. Review of the 100,586 parent-child relationships revealed 544 (0.54%) that were incorrect. Review of the 219,664 “Other” relationships suggested 1299 places in the Semantic Net where relations between pairs of semantic types could be added.
Conclusion: Semantic techniques, alone or in combination, can be used to audit the UMLS to detect inconsistencies that are not detectable through lexical techniques alone. Use of these methods to augment the UMLS maintenance process will lead to improvement in the UMLS.
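As an illustration of technique (3), the following is a minimal sketch of how lexical matching might be combined with keyword synonyms to suggest candidate redundant concept pairs. The abstract does not describe the authors' implementation, so everything here is an assumption: the function names, the word-order-insensitive matching, the grouping of synonyms with a union-find, and the toy concept identifiers and names. The output is only a list of candidates; in the study, suggested pairs were then judged by review.

```python
from collections import defaultdict
from itertools import combinations


def build_canonical_map(synonym_pairs):
    """Map every word in a synonym pair to one representative word.

    Assumption: synonymy is treated as transitive, so chained pairs
    are merged with a small union-find structure.
    """
    parent = {}

    def find(word):
        parent.setdefault(word, word)
        while parent[word] != word:
            parent[word] = parent[parent[word]]  # path compression
            word = parent[word]
        return word

    for a, b in synonym_pairs:
        root_a, root_b = find(a.lower()), find(b.lower())
        if root_a != root_b:
            # Keep the alphabetically smaller word as the representative.
            if root_b < root_a:
                root_a, root_b = root_b, root_a
            parent[root_b] = root_a

    return {word: find(word) for word in list(parent)}


def normalize(name, canon):
    """Lowercase, split on whitespace, canonicalize synonyms, sort words."""
    return tuple(sorted(canon.get(w, w) for w in name.lower().split()))


def candidate_redundant_pairs(concepts, synonym_pairs):
    """Return sorted pairs of concept ids whose normalized names coincide."""
    canon = build_canonical_map(synonym_pairs)
    buckets = defaultdict(list)
    for concept_id, name in concepts.items():
        buckets[normalize(name, canon)].append(concept_id)

    pairs = []
    for ids in buckets.values():
        pairs.extend(combinations(sorted(ids), 2))
    return sorted(pairs)


# Hypothetical toy data, not taken from the Metathesaurus.
concepts = {
    "A": "Kidney Neoplasm",
    "B": "Renal Tumor",
    "C": "Renal Artery",
}
synonym_pairs = [("kidney", "renal"), ("neoplasm", "tumor")]

print(candidate_redundant_pairs(concepts, synonym_pairs))
```

With this toy input, concepts A and B normalize to the same word set and are proposed as a candidate pair, while C is not. A sketch like this over-generates by design, which is consistent with the abstract's report that only a portion of the suggested pairs were judged to actually be redundant.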