Finding genomic ontology terms in text using evidence content

Open Access

24 May 2005

journal article
Published by Springer Nature in BMC Bioinformatics

Vol. 6 (S1), 1-S21
https://doi.org/10.1186/1471-2105-6-s1-s21

Abstract

Background: The development of text mining systems that annotate biological entities with their properties using scientific literature is an important recent research topic. These systems need first to recognize the biological entities and properties in the text, and then decide which pairs represent valid annotations.

Methods: This document introduces a novel unsupervised method for recognizing biological properties in unstructured text, involving the evidence content of their names.

Results: This document shows the results obtained by the application of our method to BioCreative tasks 2.1 and 2.2, where it identified Gene Ontology annotations and their evidence in a set of articles.

Conclusion: From the performance obtained in BioCreative, we concluded that an automatic annotation system can effectively use our method to identify biological properties in unstructured text.
