Bayesian inference of protein–protein interactions from biological literature
Open Access
- 15 April 2009
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 25 (12), 1536-1542
- https://doi.org/10.1093/bioinformatics/btp245
Abstract
Motivation: Protein–protein interaction (PPI) extraction from published biological articles has attracted much attention because of the importance of protein interactions in biological processes. Despite significant progress, mining PPIs from the literature still relies heavily on time- and resource-consuming manual annotation.
Results: In this study, we developed a novel methodology based on Bayesian networks (BNs) for extracting PPI triplets (a PPI triplet consists of two protein names and the corresponding interaction word) from unstructured text. The method achieved an overall accuracy of 87% in a cross-validation test using a manually annotated dataset. We also showed, by extracting PPI triplets from a large number of PubMed abstracts, that our method was able to complement human annotations by extracting a large number of new PPIs from the literature.
Availability: Programs/scripts we developed/used in the study are available at http://stat.fsu.edu/~jinfeng/datasets/Bio-SI-programs-Bayesian-chowdhary-zhang-liu.zip
Contact: jliu@stat.harvard.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
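To make the notion of a PPI triplet concrete, the following is a minimal sketch of how candidate triplets (two protein names plus an interaction word) might be enumerated from one tokenized sentence. It is not the authors' implementation: the function name `candidate_triplets`, the protein dictionary, the interaction-word list, and the toy sentence are all hypothetical, and the Bayesian network that would decide which candidates are true interactions is not shown.

```python
from itertools import combinations


def candidate_triplets(tokens, proteins, interaction_words):
    """Enumerate (protein, interaction word, protein) candidates in one sentence.

    Hypothetical helper for illustration only; a classifier such as a
    Bayesian network would then score each candidate as true or false.
    """
    # Positions of tokens that match the (assumed) protein dictionary.
    protein_positions = [i for i, tok in enumerate(tokens) if tok in proteins]
    # Positions of tokens that match the (assumed) interaction-word list.
    word_positions = [
        i for i, tok in enumerate(tokens) if tok.lower() in interaction_words
    ]

    triplets = []
    for i, j in combinations(protein_positions, 2):
        if tokens[i] == tokens[j]:
            continue  # skip pairs that repeat the same protein name
        for k in word_positions:
            triplets.append((tokens[i], tokens[k], tokens[j]))
    return triplets


# Toy sentence with placeholder protein names (not real data).
sentence = "ProtA interacts with ProtB".split()
result = candidate_triplets(sentence, {"ProtA", "ProtB"}, {"interacts", "binds"})
print(result)
```

Each candidate produced this way is only a hypothesis; most sentences that mention two proteins and an interaction word do not assert an interaction between that particular pair, which is why a scoring step is needed.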