Recognition of protein/gene names from text using an ensemble of classifiers
Open Access
- 24 May 2005
- journal article
- Published by Springer Nature in BMC Bioinformatics
- Vol. 6 (S1) , S7
- https://doi.org/10.1186/1471-2105-6-s1-s7
Abstract
This paper proposes an ensemble of classifiers for biomedical name recognition in which three classifiers, one Support Vector Machine and two discriminative Hidden Markov Models, are combined effectively using a simple majority voting strategy. In addition, we incorporate three post-processing modules (an abbreviation resolution module, a protein/gene name refinement module, and a simple dictionary matching module) into the system to further improve performance. Evaluation shows that our system achieves the best performance among 10 systems, with a balanced F-measure of 82.58 on the closed evaluation of the BioCreative protein/gene name recognition task (Task 1A).
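As an illustration of the majority voting idea mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes that each of the three classifiers emits one tag per token, that tags follow a B/I/O scheme, and that ties are resolved by deferring to one designated classifier; none of these details is stated in the abstract, and the function name `majority_vote` and its `fallback` parameter are hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[str]], fallback: int = 0) -> List[str]:
    """Combine per-token tag sequences from several classifiers by simple majority.

    `predictions` holds one tag sequence per classifier (assumed B/I/O tags).
    `fallback` is the index of the classifier trusted when no tag has a
    strict majority (an assumption made for this sketch).
    """
    if len({len(p) for p in predictions}) != 1:
        raise ValueError("all classifiers must tag the same number of tokens")

    combined = []
    for votes in zip(*predictions):
        tag, count = Counter(votes).most_common(1)[0]
        # A tag wins only with a strict majority; otherwise use the fallback classifier.
        combined.append(tag if count > len(votes) / 2 else votes[fallback])
    return combined


# Hypothetical outputs of one SVM and two HMM taggers over five tokens.
svm = ["B", "I", "O", "O", "B"]
hmm1 = ["B", "I", "O", "B", "O"]
hmm2 = ["B", "O", "O", "B", "I"]
print(majority_vote([svm, hmm1, hmm2]))  # ['B', 'I', 'O', 'B', 'B']
```

Voting token by token can yield tag sequences that are not well formed (for example an I tag with no preceding B), so a real system would likely vote over whole name spans or repair the sequence afterwards.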