Abstract
We studied two basic types of functional divergence in a protein family, distinguished by the observed alignment pattern (i.e., the amino acid configuration). Type I functional divergence after gene duplication results in altered functional constraints (i.e., different evolutionary rates) between duplicate genes, whereas type II results in no altered functional constraints but a radical change in amino acid properties (e.g., charge or hydrophobicity) between them. Two statistical approaches, the subtree likelihood and the whole-tree likelihood, were developed for estimating the coefficients of type I or type II functional divergence. Numerical algorithms for obtaining maximum-likelihood estimates are also provided. Moreover, a posterior-based site-specific profile is implemented to predict critical amino acid residues that are responsible for type I and/or type II functional divergence after gene duplication. Using examples, we compared the current likelihood approach with a fast method developed previously; both give similar results. For handling altered functional constraints (type I functional divergence) in large gene families with many member genes (clusters), which appear to be the normal case in postgenomics, the subtree likelihood provides a solution that is computationally feasible and robust against uncertainty in the phylogeny. The cost of this feasibility is an approximation that may introduce bias when amino acid frequencies are very skewed. The potential bias and its correction are discussed.
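To make the idea of a posterior-based site-specific profile concrete, the following minimal sketch applies Bayes' rule to a generic two-state model. It assumes that each site is either functionally divergent or not, that the coefficient of functional divergence (called `theta` here) acts as the prior probability of the divergent state, and that per-site likelihoods under each state are already available. The function name, argument names, and the two-state simplification are illustrative assumptions, not the estimation procedure developed in the paper.

```python
from typing import List, Sequence


def posterior_divergence_profile(
    theta: float,
    lik_no_divergence: Sequence[float],
    lik_divergence: Sequence[float],
) -> List[float]:
    """Per-site posterior probability of the divergent state.

    Hypothetical helper: `theta` is treated as the prior probability that
    a site is functionally divergent, and the two sequences hold the
    likelihood of each site's observed pattern under the non-divergent
    and divergent states (assumed to be computed elsewhere).
    """
    if not 0.0 <= theta <= 1.0:
        raise ValueError("theta must lie in [0, 1]")
    if len(lik_no_divergence) != len(lik_divergence):
        raise ValueError("likelihood sequences must have equal length")

    profile = []
    for p0, p1 in zip(lik_no_divergence, lik_divergence):
        numerator = theta * p1
        denominator = numerator + (1.0 - theta) * p0
        # A site with zero likelihood under both states carries no
        # information; report 0.0 rather than dividing by zero.
        profile.append(numerator / denominator if denominator > 0 else 0.0)
    return profile


if __name__ == "__main__":
    # Toy likelihoods for three sites (made-up numbers for illustration).
    scores = posterior_divergence_profile(
        theta=0.2,
        lik_no_divergence=[0.10, 0.30, 0.05],
        lik_divergence=[0.40, 0.05, 0.05],
    )
    for site, score in enumerate(scores, start=1):
        print(f"site {site}: posterior = {score:.3f}")
```

Sites whose posterior exceeds a chosen cutoff would be the candidates for residues involved in functional divergence; the cutoff itself is a user choice and is not specified here.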