A study of biomedical concept identification: MetaMap vs. people.
- 1 January 2003
- journal article
- research article
- Vol. 2003, 529-33
Abstract
Although huge amounts of unstructured text are available as a rich source of biomedical knowledge, to process this unstructured knowledge requires tools that identify concepts from free-form text. MetaMap is one tool that system developers in biomedicine have commonly used for such a task, but few have studied how well it accomplishes this task in general. In this paper, we report on a study that compares MetaMap's performance against that of six people. Such studies are challenging because the task is inherently subjective and establishing consensus is difficult. Nonetheless, for those concepts that subjects generally agreed on, MetaMap was able to identify most concepts, if they were represented in the UMLS. However, MetaMap identified many other concepts that people did not. We also report on our analysis of the types of failures that MetaMap exhibited as well as trends in the way people chose to identify concepts.
This publication has 11 references indexed in Scilit:
- GENIES: a natural-language processing system for the extraction of molecular pathways from journal articles. Bioinformatics, 2001
- Effective mapping of biomedical text to the UMLS Metathesaurus: the MetaMap program. 2001
- QueryCat: automatic categorization of MEDLINE queries. 2000
- Text-based discovery in biomedicine: the architecture of the DAD-system. 2000
- A Reliability Study for Evaluating Information Extraction from Radiology Reports. Journal of the American Medical Informatics Association, 1999
- Mining molecular binding terminology from biomedical text. 1999
- Evaluating Natural Language Processors in the Clinical Domain. Methods of Information in Medicine, 1998
- Identification of anatomical terminology in medical text. 1998
- Query expansion using the UMLS Metathesaurus. 1997
- Natural Language Processing and the Representation of Clinical Data. Journal of the American Medical Informatics Association, 1994