The probability distribution of sequences with maximum entropy that satisfies a given amino acid composition at each site and a given pairwise amino acid frequency at each site pair is a Boltzmann distribution with $\exp(-\psi_N)$, where the total interaction $\psi_N$ is represented as the sum of one body and pairwise interactions. A protein folding theory based on the random energy model (REM) indicates that the equilibrium ensemble of natural protein sequences is a canonical ensemble characterized by $\exp(-\Delta G_{ND}/k_B T_s)$ or by $\exp(-G_{N}/k_B T_s)$ if an amino acid composition is kept constant, meaning $\psi_N = \Delta G_{ND}/k_B T_s +$ constant, where $\Delta G_{ND} \equiv G_N - G_D$, $G_N$ and $G_D$ are the native and denatured free energies, and $T_s$ is the effective temperature of natural selection. Here, we examine interaction changes ($\Delta \psi_N$) due to single nucleotide nonsynonymous mutations, and have found that the variance of their $\Delta \psi_N$ over all sites hardly depends on the $\psi_N$ of each homologous sequence, indicating that the variance of $\Delta G_N (= k_B T_s \Delta \psi_N)$ is nearly constant irrespective of protein families. As a result, $T_s$ is estimated from the ratio of the variance of $\Delta \psi_N$ to that of a reference protein, which is determined by a direct comparison between $\Delta\Delta \psi_{ND} (\simeq \Delta \psi_N)$ and experimental $\Delta\Delta G_{ND}$. Based on the REM, glass transition temperature $T_g$ and $\Delta G_{ND}$ are estimated from $T_s$ and experimental melting temperatures ($T_m$) for 14 protein domains. The estimates of $\Delta G_{ND}$ agree well with their experimental values for 5 proteins, and those of $T_s$ and $T_g$ are all within a reasonable range. This method is coarse-grained but much simpler in estimating $T_s$, $T_g$ and $\Delta\Delta G_{ND}$ than previous methods.
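The variance argument above can be illustrated with a minimal sketch. If the variance of $\Delta G_N = k_B T_s \Delta \psi_N$ is the same for every protein family, then $(k_B T_s)^2 \mathrm{Var}(\Delta \psi_N)$ equals the corresponding quantity for the reference protein, so $T_s = T_{s,\mathrm{ref}} \cdot \mathrm{Sd}(\Delta \psi_{N,\mathrm{ref}}) / \mathrm{Sd}(\Delta \psi_N)$. The function name, argument names, the use of the population standard deviation, and the numbers in the usage lines are assumptions made for illustration only; they are not taken from the paper.

```python
import statistics


def estimate_selective_temperature(delta_psi, delta_psi_ref, ts_ref):
    """Estimate T_s for a protein family from a reference protein.

    Assumes Var(Delta G_N) = (k_B T_s)^2 * Var(Delta psi_N) is identical
    for the target and the reference, which gives
        T_s = T_s_ref * Sd(delta_psi_ref) / Sd(delta_psi).

    delta_psi     -- Delta psi_N values over all sites of the target family
    delta_psi_ref -- Delta psi_N values over all sites of the reference
    ts_ref        -- T_s of the reference protein (same unit as the result)
    """
    # Population standard deviation is an illustrative choice (assumption).
    sd = statistics.pstdev(delta_psi)
    sd_ref = statistics.pstdev(delta_psi_ref)
    if sd == 0:
        raise ValueError("delta_psi has zero variance; T_s is undefined")
    return ts_ref * sd_ref / sd


# Illustrative, made-up inputs: the reference spread is twice the target's,
# so the estimated T_s is twice the reference T_s.
example_ts = estimate_selective_temperature(
    delta_psi=[0.0, 1.0, 2.0],
    delta_psi_ref=[0.0, 2.0, 4.0],
    ts_ref=100.0,
)
```

A smaller spread of $\Delta \psi_N$ therefore corresponds to a higher $T_s$, that is, to weaker selection under this model. The reference $T_s$ itself must come from elsewhere; in the text it is fixed by comparing $\Delta\Delta \psi_{ND}$ with experimental $\Delta\Delta G_{ND}$.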
The common understanding of protein evolution has been that neutral or slightly deleterious mutations are fixed by random drift, and evolutionary rate is determined primarily by the proportion of neutral mutations. However, recent studies have reveal
The twenty protein coding amino acids are found in proteomes with different relative abundances. The most abundant amino acid, leucine, is nearly an order of magnitude more prevalent than the least abundant amino acid, cysteine. Amino acid metabolic
We study a continuous-time dynamical system that models the evolving distribution of genotypes in an infinite population where genomes may have infinitely many or even a continuum of loci, mutations accumulate along lineages without back-mutation, ad
The role of positive selection in human evolution remains controversial. On the one hand, scans for positive selection have identified hundreds of candidate loci and the genome-wide patterns of polymorphism show signatures consistent with frequent po