EDGAR: Extraction of Drugs, Genes And Relations from the Biomedical Literature
- 1 December 1999
- proceedings article
- Published by World Scientific Pub Co Pte Ltd in Pacific Symposium on Biocomputing
Abstract
EDGAR (Extraction of Drugs, Genes and Relations) is a natural language processing system that extracts information about drugs and genes relevant to cancer from the biomedical literature. This automatically extracted information has remarkable potential to facilitate computational analysis in the molecular biology of cancer, and the technology is straightforwardly generalizable to many areas of biomedicine. This paper reports on the mechanisms for automatically generating such assertions and on a simple application, conceptual clustering of documents. The system uses a stochastic part-of-speech tagger, generates an underspecified syntactic parse, and then uses semantic and pragmatic information to construct its assertions. The system builds on two important existing resources: the MEDLINE database of biomedical citations and abstracts, and the Unified Medical Language System, which provides syntactic and semantic information about the terms found in biomedical abstracts.
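The pipeline outlined in the abstract (tag terms, identify related entities, construct assertions, then cluster documents by concept) can be illustrated with a toy sketch. This is not EDGAR's implementation: a plain dictionary lookup stands in for both the stochastic part-of-speech tagger and the Unified Medical Language System, a nearest-neighbor pattern stands in for the underspecified parse, and every name here (`LEXICON`, `RELATION_VERBS`, `extract_assertions`, `cluster_by_concept`, `drug_a`, `gene_x`, and the sample sentences) is a hypothetical placeholder invented for illustration.

```python
from collections import defaultdict

# Hypothetical mini-lexicon; a stand-in for semantic types from a real resource.
LEXICON = {
    "drug_a": "DRUG",
    "drug_b": "DRUG",
    "gene_x": "GENE",
    "gene_y": "GENE",
}
# Hypothetical relation verbs mapped to a normalized relation name.
RELATION_VERBS = {
    "inhibits": "INHIBITS",
    "induces": "INDUCES",
    "activates": "ACTIVATES",
}
ENTITY_LABELS = ("DRUG", "GENE")
PUNCT = ".,;:()"


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (tok.strip(PUNCT).lower() for tok in text.split())
    return [tok for tok in tokens if tok]


def tag(tokens):
    """Label each token DRUG, GENE, REL, or O (other) by dictionary lookup."""
    tagged = []
    for tok in tokens:
        if tok in LEXICON:
            tagged.append((tok, LEXICON[tok]))
        elif tok in RELATION_VERBS:
            tagged.append((tok, "REL"))
        else:
            tagged.append((tok, "O"))
    return tagged


def extract_assertions(sentence):
    """Return (subject, relation, object) triples: each relation verb is linked
    to the nearest entity on its left and the nearest entity on its right."""
    tagged = tag(tokenize(sentence))
    triples = []
    for i, (tok, label) in enumerate(tagged):
        if label != "REL":
            continue
        left = next((t for t, l in reversed(tagged[:i]) if l in ENTITY_LABELS), None)
        right = next((t for t, l in tagged[i + 1:] if l in ENTITY_LABELS), None)
        if left and right:
            triples.append((left, RELATION_VERBS[tok], right))
    return triples


def cluster_by_concept(docs):
    """Group document ids under every lexicon concept they mention."""
    clusters = defaultdict(set)
    for doc_id, text in docs.items():
        for tok in tokenize(text):
            if tok in LEXICON:
                clusters[tok].add(doc_id)
    return {concept: sorted(ids) for concept, ids in clusters.items()}


if __name__ == "__main__":
    # Invented sentences with placeholder names; not drawn from MEDLINE.
    docs = {
        "doc1": "drug_a induces gene_x expression.",
        "doc2": "drug_b inhibits gene_y in cultured cells.",
        "doc3": "gene_x activates a downstream pathway.",
    }
    for doc_id, text in docs.items():
        print(doc_id, extract_assertions(text))
    print(cluster_by_concept(docs))
```

A real system would replace the lookup tables with a trained tagger and a curated terminology, and the nearest-entity heuristic with a parse that respects phrase boundaries; the sketch only shows how tagged terms can be turned into subject-relation-object assertions and how shared concepts can group documents.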