Information extraction from full text scientific articles: Where are the keywords?
Open Access
- 29 May 2003
- journal article
- research article
- Published by Springer Nature in BMC Bioinformatics
- Vol. 4 (1) , 20
- https://doi.org/10.1186/1471-2105-4-20
Abstract
To date, many methods for extracting biological information from scientific articles are restricted to the abstract of the article. However, full text articles in electronic form, which offer larger sources of data, are now available. Questions arise as to whether the effort of scanning full text articles is worthwhile, and whether the information that can be extracted from the different sections of an article is relevant. In this work we addressed those questions, showing that the keyword content of the different sections of a standard scientific article (abstract, introduction, methods, results, and discussion) is very heterogeneous. Although the abstract contains the best ratio of keywords to total words, other sections of the article may be a better source of biologically relevant data.