Co-occurrence based meta-analysis of scientific texts: retrieving biological relationships between genes
Open Access
- 18 January 2005
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 21 (9) , 2049-2058
- https://doi.org/10.1093/bioinformatics/bti268
Abstract
Motivation: The advent of high-throughput experiments in molecular biology creates a need for methods to efficiently extract and use information for large numbers of genes. Recently, the associative concept space (ACS) has been developed for the representation of information extracted from biomedical literature. The ACS is a Euclidean space in which thesaurus concepts are positioned and the distances between concepts indicate their relatedness. The ACS uses co-occurrence of concepts as a source of information. In this paper we evaluate how well the system can retrieve functionally related genes and we compare its performance with a simple gene co-occurrence method.
Results: To assess the performance of the ACS, we composed a test set of five groups of functionally related genes. With the ACS, good scores were obtained for four of the five groups. Compared with the gene co-occurrence method, the ACS is capable of revealing more functional biological relations and can achieve results with less literature available per gene. Hierarchical clustering was performed on the ACS output as a potential aid to users, and was found to provide useful clusters. Our results suggest that the algorithm can be of value for researchers studying large numbers of genes.
Availability: The ACS program is available upon request from the authors.
Contact: r.jelier@erasmusmc.nl
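As an illustration of the kind of baseline the abstract calls a simple gene co-occurrence method, the following minimal sketch counts how many documents mention each pair of genes together and ranks the partners of a seed gene by that count. It is not the authors' implementation and it is not the ACS algorithm. The function names, the gene identifiers, and the toy corpus are hypothetical, and each document is assumed to have already been reduced to a list of gene mentions.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(docs):
    """Count, for each unordered gene pair, the documents that mention both genes."""
    counts = Counter()
    for genes in docs:
        # Deduplicate within a document so each pair is counted once per document.
        for a, b in combinations(sorted(set(genes)), 2):
            counts[(a, b)] += 1
    return counts


def rank_partners(counts, seed):
    """Rank genes by their document co-occurrence count with the seed gene."""
    partners = Counter()
    for (a, b), n in counts.items():
        if a == seed:
            partners[b] += n
        elif b == seed:
            partners[a] += n
    # Highest count first; ties broken alphabetically for a stable order.
    return sorted(partners.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical toy corpus: one list of gene mentions per document.
docs = [
    ["GENE_A", "GENE_B"],
    ["GENE_A", "GENE_B", "GENE_C"],
    ["GENE_C", "GENE_D"],
    ["GENE_A"],
]

counts = cooccurrence_counts(docs)
ranking = rank_partners(counts, "GENE_A")
print(ranking)
```

In this sketch a gene that never shares a document with the seed gene (GENE_D here) does not appear in the ranking at all, which is one way to see why a direct co-occurrence approach depends on the amount of literature available per gene.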