Text Mining for Metabolic Pathways, Signaling Cascades, and Protein Networks
- 10 May 2005
- journal article
- review article
- Published by American Association for the Advancement of Science (AAAS) in Science's STKE
- Vol. 2005 (283), pe21
- https://doi.org/10.1126/stke.2832005pe21
Abstract
The complexity of the information stored in databases and publications on metabolic and signaling pathways, the high throughput of experimental data, and the growing number of publications make it imperative to provide systems to help the researcher navigate through these interrelated information resources. Text-mining methods have started to play a key role in the creation and maintenance of links between the information stored in biological databases and its original sources in the literature. These links will be extremely useful for database updating and curation, especially if a number of technical problems can be solved satisfactorily, including the identification of protein and gene names (entities in general) and the characterization of their types of interactions. The first generation of openly accessible text-mining systems, such as iHOP (Information Hyperlinked over Proteins), provides additional functions to facilitate the reconstruction of protein interaction networks, combine database and text information, and support the scientist in the formulation of novel hypotheses. The next challenge is the generation of comprehensive information regarding the general function of signaling pathways and protein interaction networks.