Biomedical term mapping databases
Open Access
- 17 December 2004
- journal article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 33 (Database), D289-D293
- https://doi.org/10.1093/nar/gki137
Abstract
Longer words and phrases are frequently mapped onto shorter forms, such as abbreviations or acronyms, for efficiency of communication. These abbreviations are pervasive in all aspects of biology and medicine, and as the amount of biomedical literature grows, so do the number of abbreviations and the average number of definitions per abbreviation. Even more confusing, different authors will often abbreviate the same word or phrase differently. This ambiguity impedes our ability to retrieve information, integrate databases and mine textual databases for content. Efforts to standardize nomenclature, especially retrospective ones, need to be aware of different abbreviatory mappings and spelling variations. To address this problem, there have been several efforts to develop computer algorithms that identify the mapping of terms between short and long forms within a large body of literature. To date, four such algorithms have been applied to create online databases that comprehensively map biomedical terms and abbreviations within MEDLINE: ARGH (http://lethargy.swmed.edu/ARGH/argh.asp), the Stanford Biomedical Abbreviation Server (http://bionlp.stanford.edu/abbreviation/), AcroMed (http://medstract.med.tufts.edu/acro1.1/index.htm) and SaRAD (http://www.hpl.hp.com/research/idl/projects/abbrev.html). In addition to serving as useful computational tools, these databases serve as valuable references that help biologists keep up with an ever-expanding vocabulary of terms.
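To make the idea of mapping a short form to its long form concrete, the following is a minimal sketch of one common heuristic: look for a parenthesized short form and match its characters, right to left, against the words that precede it. It is an illustration only and is not claimed to be the method used by ARGH, the Stanford Biomedical Abbreviation Server, AcroMed or SaRAD. The names `PAREN`, `find_long_form` and `extract_pairs`, and the window-size rule, are assumptions made for this example.

```python
import re

# Assumption: a short form is 2-10 characters, starts with a letter,
# and appears in parentheses right after its long form.
PAREN = re.compile(r"\(([A-Za-z][A-Za-z0-9-]{1,9})\)")


def find_long_form(short, candidate):
    """Match the characters of `short` right to left inside `candidate`.

    Returns the trailing part of `candidate` that covers all characters
    of `short` in order, or None if no such match exists. The first
    character of `short` must fall at the start of a word.
    """
    s = len(short) - 1
    c = len(candidate) - 1
    while s >= 0:
        ch = short[s].lower()
        if not ch.isalnum():  # skip hyphens and similar characters
            s -= 1
            continue
        # Move left until the character matches (and, for the first
        # character of the short form, sits at a word boundary).
        while c >= 0 and (
            candidate[c].lower() != ch
            or (s == 0 and c > 0 and candidate[c - 1].isalnum())
        ):
            c -= 1
        if c < 0:
            return None
        s -= 1
        c -= 1
    return candidate[c + 1:]


def extract_pairs(text):
    """Return (short form, long form) pairs found in `text`."""
    pairs = []
    for m in PAREN.finditer(text):
        short = m.group(1)
        words = text[:m.start()].split()
        # Assumed window: only look a limited number of words back.
        window = min(len(short) + 5, len(short) * 2)
        candidate = " ".join(words[-window:])
        long_form = find_long_form(short, candidate)
        if long_form:
            pairs.append((short, long_form))
    return pairs


if __name__ == "__main__":
    sentence = "Patients with acute myeloid leukemia (AML) received therapy."
    print(extract_pairs(sentence))
```

A sketch like this will also produce false positives, for example on ordinary parenthesized words, and it ignores long forms that follow the short form. Collecting the extracted pairs per abbreviation across many abstracts is what exposes the multiple-definition ambiguity described above.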