Disulfide connectivity prediction using secondary structure information and diresidue frequencies
Open Access
- 1 March 2005
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 21 (10) , 2336-2346
- https://doi.org/10.1093/bioinformatics/bti328
Abstract
Motivation: We describe a stand-alone algorithm to predict disulfide bond partners in a protein given only the amino acid sequence, using a novel neural network architecture (the diresidue neural network), and given input of symmetric flanking regions of N-terminus and C-terminus half-cystines augmented with residue secondary structure (helix, coil, sheet) as well as evolutionary information. The approach is motivated by the observation of a bias in the secondary structure preferences of free cysteines and half-cystines, and by promising preliminary results we obtained using diresidue position-specific scoring matrices.
Results: As calibrated by receiver operating characteristic curves from 4-fold cross-validation, our conditioning on secondary structure allows our novel diresidue neural network to perform as well as, and in some cases better than, the current state-of-the-art method. A slight drop in performance is seen when secondary structure is predicted rather than being derived from three-dimensional protein structures.
Availability: http://clavius.bc.edu/~clotelab/DiANNA
Contact: clote@bc.edu
Supplementary information: Supplementary tables and figures, and the complete list of PDB codes of monomers used, can be found at http://clavius.bc.edu/~clotelab/
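The input described in the abstract (symmetric flanking windows around two candidate half-cystines, each residue augmented with its secondary structure state) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the window half-width of 2, the one-hot encoding, the padding symbol and the H/E/C state letters are all assumptions chosen for illustration, and the evolutionary information and the diresidue neural network itself are omitted.

```python
from itertools import combinations

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
SS_STATES = "HEC"  # assumed letters for helix, sheet (strand), coil


def window(seq, center, half_width, pad="-"):
    """Symmetric window of 2*half_width + 1 symbols centred on `center`.

    Positions that fall off either terminus are filled with `pad`.
    """
    padded = pad * half_width + seq + pad * half_width
    return padded[center:center + 2 * half_width + 1]


def one_hot(symbol, alphabet):
    """One-hot vector over `alphabet`; the pad symbol maps to all zeros."""
    return [1.0 if symbol == a else 0.0 for a in alphabet]


def candidate_pairs(seq):
    """All unordered pairs of cysteine positions (0-based) in `seq`."""
    cys = [k for k, aa in enumerate(seq) if aa == "C"]
    return list(combinations(cys, 2))


def encode_pair(seq, ss, i, j, half_width=2):
    """Concatenate residue and secondary structure encodings for the
    windows around cysteines i and j (hypothetical feature layout)."""
    features = []
    for pos in (i, j):
        residues = window(seq, pos, half_width)
        states = window(ss, pos, half_width)
        for res, state in zip(residues, states):
            features += one_hot(res, AMINO_ACIDS) + one_hot(state, SS_STATES)
    return features


# Toy sequence and secondary structure string, invented for this example.
seq = "ACDCEFC"
ss = "CHHECCC"
pairs = candidate_pairs(seq)
vectors = {pair: encode_pair(seq, ss, *pair) for pair in pairs}
```

Each candidate pair yields one fixed-length vector that a pairwise classifier could score; with these assumed settings the length is 2 windows x 5 positions x 23 values per position.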