Proteins associated with diseases show enhanced sequence correlation between charged residues
Open Access
- 8 April 2004
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 20 (15) , 2345-2354
- https://doi.org/10.1093/bioinformatics/bth245
Abstract
Motivation: The function of a protein, or of a network of interacting proteins, often involves communication between residues that are well separated in sequence. The classic example is the participation of distant residues in allosteric regulation. Bioinformatic and structural analysis methods have been introduced to infer which residues are correlated. Recently, increasing attention has been paid to identifying the sequence properties that determine the tendency of disease-related proteins (Aβ peptides, prion proteins, transthyretin, etc.) to aggregate and form fibrils. Motivated in part by the need to identify sequence characteristics that indicate a tendency to aggregate, we introduce a general method that probes covariations in charged residues along the sequence in a given protein family. The method, which involves computing the sequence correlation entropy (SCE) using the quenched probability Psk(i,j) of finding a residue pair at a given sequence separation, sk, allows us to classify protein families in terms of their SCE. Our general approach may be a useful way of obtaining evolutionary covariations of amino acid residues on a genome-wide level.
Results: We use a combination of SCE and clustering based on principal component analysis to classify the protein families. From an analysis of 839 families, covering ∼500 000 sequences, we find that proteins with relatively low values of SCE are predominantly associated with various diseases. In several families, residues that give rise to peaks in Psk(i,j) are clustered in the three-dimensional structure. For the class of proteins with low SCE values, there are significant numbers of mixed charged-hydrophobic (CH) and charged-polar (CP) runs. Our findings suggest that low values of SCE and the presence of CH and/or CP runs may be indicative of disease association or a tendency to aggregate. Our results lead to the hypothesis that the functions of proteins with similar SCE values may be linked. The hypothesis is validated with a few anecdotal examples. The present results also lead to the prediction that the overall charge correlations in proteins affect the kinetics of amyloid formation, a feature that is common to all proteins implicated in neurodegenerative diseases.
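The abstract does not give the formula for the SCE, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes that charged-pair counts are pooled over all sequences in a family, normalized over sequence separations to give a probability distribution, and summarized by a Shannon entropy. The function names, the treatment of K/R as positive and D/E as negative (histidine taken as neutral), the pooling in place of the quenched average, and the `max_sep` cutoff are all assumptions made for illustration.

```python
import math
from collections import Counter

# Assumed charge assignment: K, R positive; D, E negative; all else neutral.
POSITIVE = set("KR")
NEGATIVE = set("DE")


def charge(residue):
    """Return '+', '-' or None for a one-letter amino acid code."""
    if residue in POSITIVE:
        return "+"
    if residue in NEGATIVE:
        return "-"
    return None


def pair_separation_probabilities(sequences, pair=("+", "-"), max_sep=10):
    """Estimate the probability of finding the charge pair (i, j) at each
    sequence separation s = 1..max_sep, pooling counts over all sequences.

    Hypothetical stand-in for the probability named in the abstract.
    """
    counts = Counter()
    for seq in sequences:
        charges = [charge(r) for r in seq.upper()]
        for a, first in enumerate(charges):
            if first != pair[0]:
                continue
            for s in range(1, max_sep + 1):
                b = a + s
                if b >= len(charges):
                    break
                if charges[b] == pair[1]:
                    counts[s] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    # Normalize over separations so the values sum to 1.
    return {s: n / total for s, n in counts.items()}


def correlation_entropy(probabilities):
    """Shannon entropy (in nats) of the separation distribution."""
    return -sum(p * math.log(p) for p in probabilities.values() if p > 0)


if __name__ == "__main__":
    family = ["MKAEDLLRKDE", "MKSEDLIRRDE", "MRAEELLKKDD"]  # toy sequences
    probs = pair_separation_probabilities(family, pair=("+", "-"), max_sep=5)
    print(probs)
    print(correlation_entropy(probs))
```

Under these assumptions, a low entropy means that charged pairs concentrate at a few characteristic separations, while a high entropy means they are spread roughly evenly over separations.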