Identifying gene and protein mentions in text using conditional random fields
Open Access
- 24 May 2005
- journal article
- Published by Springer Nature in BMC Bioinformatics
- Vol. 6 (S1), S6
- https://doi.org/10.1186/1471-2105-6-s1-s6
Abstract
We present a model for tagging gene and protein mentions from text using the probabilistic sequence tagging framework of conditional random fields (CRFs). Conditional random fields model the probability P(t|o) of a tag sequence given an observation sequence directly, and have previously been employed successfully for other tagging tasks. The mechanics of CRFs and their relationship to maximum entropy are discussed in detail.
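As a rough illustration of what modelling P(t|o) directly means, the sketch below scores every tag sequence for a short sentence with a toy linear-chain CRF and normalises over all sequences using the forward algorithm. The tag set, the hand-picked weights, and the function names (`emission`, `TRANSITION`, `score`, `log_partition`, `prob`) are assumptions made for this example and are not taken from the paper, whose actual features and training procedure are not described in this abstract.

```python
import itertools
import math

# Hypothetical two-tag scheme; the paper's actual tag set is not given here.
TAGS = ["GENE", "O"]


def emission(tag, token):
    """Toy per-token score for labelling `token` with `tag` (weights are made up)."""
    looks_like_gene = any(c.isdigit() for c in token) or token.isupper()
    if tag == "GENE":
        return 1.5 if looks_like_gene else -1.0
    return -0.5 if looks_like_gene else 0.5


# Toy scores for moving from one tag to the next (also made up).
TRANSITION = {
    ("GENE", "GENE"): 0.5,
    ("GENE", "O"): 0.0,
    ("O", "GENE"): 0.0,
    ("O", "O"): 0.3,
}


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def score(tags, tokens):
    """Unnormalised log-score of one tag sequence for the given tokens."""
    total = 0.0
    for i, (tag, token) in enumerate(zip(tags, tokens)):
        total += emission(tag, token)
        if i > 0:
            total += TRANSITION[(tags[i - 1], tag)]
    return total


def log_partition(tokens):
    """log Z(o): log of the summed exp-scores of all tag sequences (forward algorithm)."""
    alpha = {t: emission(t, tokens[0]) for t in TAGS}
    for token in tokens[1:]:
        alpha = {
            t: emission(t, token)
            + logsumexp([alpha[p] + TRANSITION[(p, t)] for p in TAGS])
            for t in TAGS
        }
    return logsumexp(list(alpha.values()))


def prob(tags, tokens):
    """P(t|o) = exp(score(t, o)) / Z(o)."""
    return math.exp(score(tags, tokens) - log_partition(tokens))


if __name__ == "__main__":
    sentence = ["the", "BRCA1", "gene"]
    for candidate in itertools.product(TAGS, repeat=len(sentence)):
        print(candidate, round(prob(candidate, sentence), 4))
```

Because the normaliser Z(o) is computed for each observation sequence, the probabilities of all candidate tag sequences for one sentence sum to one. The exponentiated sum of weighted features has the same log-linear form as a maximum entropy model, applied here to whole sequences rather than to single tagging decisions.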