CODA: A combined algorithm for predicting the structurally variable regions of protein models

1 March 2001

journal article
research article
Published by Wiley in Protein Science

Vol. 10 (3) , 599-612
https://doi.org/10.1110/ps.37601

Abstract

CODA, an algorithm for predicting the variable regions in proteins, combines FREAD, a knowledge-based approach, and PETRA, which constructs the region ab initio. FREAD selects from a database of protein structure fragments using environmentally constrained substitution tables and other rule-based filters. FREAD was parameterized and tested on over 3000 loops. On a nonhomologous test set, the average root mean square deviation ranged from 0.78 Å for three-residue loops to 3.5 Å for eight-residue loops. CODA clusters the predictions from the two independent programs and makes a consensus prediction that must pass a set of rule-based filters. CODA was parameterized and tested on two separate sets of structures that were nonhomologous to one another and to those found in the FREAD database. The average root mean square deviation in the test set ranged from 0.76 Å for three-residue loops to 3.09 Å for eight-residue loops. CODA shows a general improvement in loop prediction over PETRA and FREAD individually. The improvement is far more marked for loops of six residues and upward, probably because the predictive power of PETRA becomes more important at these lengths. CODA was further tested on several model structures to determine its applicability to the modeling situation. A web server of CODA is available at http://www-cryst.bioc.cam.ac.uk/-charlotte/Coda/search_coda.html.
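
The two quantitative ideas in the abstract, root mean square deviation between loop conformations and a consensus drawn from two independent predictors, can be illustrated with a minimal Python sketch. This is not the published CODA procedure: the function names `rmsd` and `consensus_pair`, the `cutoff` parameter, and the rule of picking the closest cross-method pair are all assumptions made for illustration, and the coordinates are assumed to be already placed in a common frame (no superposition is performed).

```python
import math
from itertools import product


def rmsd(a, b):
    """Root mean square deviation between two equal-length lists of (x, y, z).

    Assumption: both conformations are already in the same coordinate frame.
    """
    if len(a) != len(b) or not a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2
        for point_a, point_b in zip(a, b)
        for xa, xb in zip(point_a, point_b)
    )
    return math.sqrt(squared / len(a))


def consensus_pair(knowledge_based, ab_initio, cutoff):
    """Hypothetical consensus rule: find the closest pair of candidates,
    one from each method, whose mutual RMSD is within `cutoff`.

    Returns (index_in_knowledge_based, index_in_ab_initio, rmsd) or None.
    """
    best = None
    pairs = product(enumerate(knowledge_based), enumerate(ab_initio))
    for (i, kb_loop), (j, ai_loop) in pairs:
        d = rmsd(kb_loop, ai_loop)
        if d <= cutoff and (best is None or d < best[2]):
            best = (i, j, d)
    return best


# Toy three-atom "loops"; real input would be backbone atom coordinates.
loop_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
loop_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (2.0, 0.0, 1.0)]

print(rmsd(loop_a, loop_b))                                # 1.0
print(consensus_pair([loop_a], [loop_b, loop_a], cutoff=0.5))  # (0, 1, 0.0)
```

In this toy example the two conformations differ by a uniform 1 Å shift, so their RMSD is exactly 1.0. Agreement between independently generated candidates is used here as a stand-in for the clustering step the abstract mentions; the rule-based filters applied afterwards are not modeled.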
