Selection maintaining protein stability at equilibrium


Abstract in English

The common understanding of protein evolution has been that neutral or slightly deleterious mutations are fixed by random drift, and evolutionary rate is determined primarily by the proportion of neutral mutations. However, recent studies have revealed that highly expressed genes evolve slowly because of fitness costs due to misfolded proteins. Here we study selection maintaining protein stability. Protein fitness is taken to be $s = kappa exp(betaDelta G) (1 - exp(betaDeltaDelta G))$, where $s$ and $DeltaDelta G$ are selective advantage and stability change of a mutant protein, $Delta G$ is the folding free energy of the wild-type protein, and $kappa$ represents protein abundance and indispensability. The distribution of $DeltaDelta G$ is approximated to be a bi-Gaussian function, which represents structurally slightly- or highly-constrained sites. Also, the mean of the distribution is negatively proportional to $Delta G$. The evolution of this gene has an equilibrium ($Delta G_e$) of protein stability, the range of which is consistent with experimental values. The probability distribution of $K_a/K_s$, the ratio of nonsynonymous to synonymous substitution rate per site, over fixed mutants in the vicinity of the equilibrium shows that nearly neutral selection is predominant only in low-abundant, non-essential proteins of $Delta G_e > -2.5$ kcal/mol. In the other proteins, positive selection on stabilizing mutations is significant to maintain protein stability at equilibrium as well as random drift on slightly negative mutations, although the average $langle K_a/K_s rangle$ is less than 1. Slow evolutionary rates can be caused by high protein abundance/indispensability, which produces positive shifts of $DeltaDelta G$ through decreasing $Delta G_e$, and by strong structural constraints, which directly make $DeltaDelta G$ more positive.

Download