PPRODO: Prediction of protein domain boundaries using neural networks
- 23 March 2005
- journal article
- research article
- Published by Wiley in Proteins-Structure Function and Bioinformatics
- Vol. 59 (3), 627-632
- https://doi.org/10.1002/prot.20442
Abstract
Successful prediction of protein domain boundaries provides valuable information not only for the computational structure prediction of multidomain proteins but also for the experimental structure determination. Since protein sequences of multiple domains may contain much information regarding evolutionary processes such as gene–exon shuffling, this information can be detected by analyzing the position-specific scoring matrix (PSSM) generated by PSI-BLAST. We have presented a method, PPRODO (Prediction of PROtein DOmain boundaries) that predicts domain boundaries of proteins from sequence information by a neural network. The network is trained and tested using the values obtained from the PSSM generated by PSI-BLAST. A 10-fold cross-validation technique is performed to obtain the parameters of neural networks using a nonredundant set of 522 proteins containing 2 contiguous domains. PPRODO provides good and consistent results for the prediction of domain boundaries, with accuracy of about 66% using the ±20 residue criterion. The PPRODO source code, as well as all data sets used in this work, are available from http://gene.kias.re.kr/∼jlee/pprodo/. Proteins 2005.
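To make the evaluation criterion concrete, here is a minimal sketch of how a ±20 residue accuracy could be computed, and how a window of PSSM rows might be flattened into a network input. It is not the PPRODO implementation. The function names, the window half-width, the zero padding at the sequence ends, and the choice to count a distance of exactly 20 residues as correct are all assumptions made for illustration; the abstract does not specify them.

```python
from typing import List, Sequence


def window_features(
    pssm: Sequence[Sequence[float]], center: int, half_width: int
) -> List[float]:
    """Flatten the PSSM rows around `center` into one feature vector.

    Positions that fall outside the sequence are zero-padded
    (an assumption; the abstract does not describe the encoding).
    """
    n_cols = len(pssm[0])
    features: List[float] = []
    for pos in range(center - half_width, center + half_width + 1):
        if 0 <= pos < len(pssm):
            features.extend(pssm[pos])
        else:
            features.extend([0.0] * n_cols)
    return features


def is_correct(predicted: int, true: int, tolerance: int = 20) -> bool:
    """A prediction counts as correct if it lies within `tolerance`
    residues of the true boundary (inclusive, by assumption)."""
    return abs(predicted - true) <= tolerance


def boundary_accuracy(
    predicted: Sequence[int], true: Sequence[int], tolerance: int = 20
) -> float:
    """Fraction of two-domain proteins whose single predicted boundary
    satisfies the tolerance criterion."""
    if len(predicted) != len(true) or not true:
        raise ValueError("need equal-length, non-empty boundary lists")
    hits = sum(is_correct(p, t, tolerance) for p, t in zip(predicted, true))
    return hits / len(true)


if __name__ == "__main__":
    # Hypothetical boundaries for three proteins (not data from the paper).
    predicted = [100, 150, 60]
    true = [110, 120, 80]
    print(boundary_accuracy(predicted, true))
```

In a 10-fold cross-validation setting, `boundary_accuracy` would be computed on each held-out fold and the results averaged.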