OntoGene in BioCreative II
Open Access
- 1 September 2008
- journal article
- Published by Springer Nature in Genome Biology
- Vol. 9 (S2), S13
- https://doi.org/10.1186/gb-2008-9-s2-s13
Abstract
Background: Research scientists and companies working in the domains of biomedicine and genomics are increasingly faced with the problem of efficiently locating, within the vast body of published scientific findings, the critical pieces of information that are needed to direct current and future research investment.
Results: In this report we describe approaches taken within the scope of the second BioCreative competition in order to solve two aspects of this problem: detection of novel protein interactions reported in scientific articles, and detection of the experimental method that was used to confirm the interaction. Our approach to the former problem is based on a high-recall protein annotation step, followed by two strict disambiguation steps. The remaining proteins are then combined according to a number of lexico-syntactic filters, which deliver high-precision results while maintaining reasonable recall. The detection of the experimental methods is tackled by a pattern matching approach, which has delivered the best results in the official BioCreative evaluation.
Conclusion: Although the results of BioCreative clearly show that no tool is sufficiently reliable for fully automated annotations, a few of the proposed approaches (including our own) already perform at a competitive level. This makes them interesting either as standalone tools for preliminary document inspection, or as modules within an environment aimed at supporting the process of curation of biomedical literature.
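To make the idea of pattern-matching detection of experimental methods more concrete, the following is a minimal sketch only. The abstract does not give the actual patterns, method labels, or ranking scheme used by the authors, so the pattern table `METHOD_PATTERNS`, the function `detect_methods`, and the ranking by match count are all hypothetical choices made for illustration.

```python
import re

# Hypothetical pattern table: method label -> regular expression.
# These patterns are illustrative assumptions, not the ones used in the paper.
METHOD_PATTERNS = {
    "two hybrid": re.compile(r"\b(?:yeast\s+)?two[- ]hybrid\b", re.IGNORECASE),
    "coimmunoprecipitation": re.compile(r"\bco-?immunoprecipitat\w*", re.IGNORECASE),
    "pull down": re.compile(r"\bpull[- ]?down\b", re.IGNORECASE),
}


def detect_methods(text):
    """Return the labels of methods mentioned in text, most frequent first."""
    counts = {}
    for label, pattern in METHOD_PATTERNS.items():
        hits = len(pattern.findall(text))
        if hits:
            counts[label] = hits
    # Rank by descending match count; break ties alphabetically for stability.
    return sorted(counts, key=lambda label: (-counts[label], label))


if __name__ == "__main__":
    sample = (
        "The interaction was confirmed by yeast two-hybrid and "
        "co-immunoprecipitation; a second two hybrid screen agreed."
    )
    print(detect_methods(sample))
```

A real system would presumably draw its patterns from a controlled vocabulary of interaction detection methods and weight matches by where they occur in the article; neither detail is specified in the abstract.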