Investigation into biomedical literature classification using support vector machines
- 1 January 2005
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- p. 366-374
- https://doi.org/10.1109/csb.2005.36
Abstract
Searching for a specific topic in the PubMed database, one of the most important information resources for the scientific community, presents a major challenge to users. A researcher typically formulates Boolean queries and then scans the retrieved records for relevance, which is time-consuming and error-prone. We applied Support Vector Machines (SVM) to the automatic retrieval of PubMed articles related to human genome epidemiological research at the Centers for Disease Control and Prevention (CDC). In this paper, we discuss various investigations into biomedical literature classification and analyze the effect of the choice of keywords, training sets, kernel functions, and parameters for the SVM technique. We report on these factors to show that SVM is a viable technique for automatically classifying biomedical literature into topics of interest such as epidemiology, cancer, and birth defects. In all our experiments, we achieved high values of positive predictive value (PPV), sensitivity, and specificity.
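The abstract reports PPV, sensitivity, and specificity without defining them. As an illustration only, the standard definitions of these three measures for a binary relevant/not-relevant classifier can be sketched as follows. The function name `classification_metrics` and the label lists are hypothetical and are not taken from the paper; the labels are made-up values, not the paper's data or results.

```python
def classification_metrics(y_true, y_pred):
    """Compute PPV, sensitivity, and specificity for binary labels (1 = relevant)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # relevant, retrieved
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # irrelevant, retrieved
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # irrelevant, rejected
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # relevant, missed

    def ratio(num, den):
        # Return NaN instead of raising when a denominator is zero.
        return num / den if den else float("nan")

    return {
        "ppv": ratio(tp, tp + fp),          # share of retrieved articles that are relevant
        "sensitivity": ratio(tp, tp + fn),  # share of relevant articles that are retrieved
        "specificity": ratio(tn, tn + fp),  # share of irrelevant articles that are rejected
    }


# Made-up labels for eight articles (1 = on topic, 0 = off topic).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In a retrieval setting such as the one described, PPV corresponds to what is often called precision and sensitivity to recall.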