MutationFinder: a high-performance system for extracting point mutation mentions from text
Open Access
- 11 May 2007
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 23 (14) , 1862-1865
- https://doi.org/10.1093/bioinformatics/btm235
Abstract
Summary: Discussion of point mutations is ubiquitous in biomedical literature, and manually compiling databases or literature on mutations in specific genes or proteins is tedious. We present an open-source, rule-based system, MutationFinder, for extracting point mutation mentions from text. On blind test data, it achieves nearly perfect precision and a markedly improved recall over a baseline.
Availability: MutationFinder, along with a high-quality gold standard data set, and a scoring script for mutation extraction systems have been made publicly available. Implementations, source code and unit tests are available in Python, Perl and Java. MutationFinder can be used as a stand-alone script, or imported by other applications.
Project URL: http://bionlp.sourceforge.net
Contact: gregcaporaso@gmail.com
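To make the idea of rule-based extraction and its scoring concrete, here is a minimal Python sketch. It is not MutationFinder's rule set or API: the two patterns, the function names `extract_point_mutations` and `precision_recall`, and the example sentence are all assumptions made for illustration. It recognises only compact mentions such as `A42G` or `Ala64Val`, and it would also match look-alike tokens such as cell-line names, so it should be read as a toy baseline rather than a usable extractor.

```python
import re

# Three-letter to one-letter amino acid codes (the 20 standard residues).
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}
ONE_LETTER = "".join(sorted(THREE_TO_ONE.values()))
THREE_LETTER = "|".join(THREE_TO_ONE)

# Hypothetical illustrative rules: wild-type residue, position, mutant residue.
PATTERNS = [
    re.compile(rf"\b(?P<wt>[{ONE_LETTER}])(?P<pos>[1-9]\d*)(?P<mut>[{ONE_LETTER}])\b"),
    re.compile(rf"\b(?P<wt>{THREE_LETTER})(?P<pos>[1-9]\d*)(?P<mut>{THREE_LETTER})\b"),
]


def extract_point_mutations(text):
    """Return a set of (wild_type, position, mutant) tuples found in text."""
    found = set()
    for pattern in PATTERNS:
        for match in pattern.finditer(text):
            # Normalise three-letter codes to one-letter codes.
            wt = THREE_TO_ONE.get(match["wt"], match["wt"])
            mut = THREE_TO_ONE.get(match["mut"], match["mut"])
            if wt != mut:  # a substitution must change the residue
                found.add((wt, int(match["pos"]), mut))
    return found


def precision_recall(predicted, gold):
    """Score a set of predicted mentions against a gold-standard set."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


if __name__ == "__main__":
    sentence = "The A42G and Ala64Val substitutions reduced activity."
    mentions = extract_point_mutations(sentence)
    print(sorted(mentions))
    print(precision_recall(mentions, {("A", 42, "G"), ("K", 10, "R")}))
```

Normalising every mention to a `(wild_type, position, mutant)` tuple lets the same set operations compute precision and recall regardless of how the mutation was written in the text.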