This work is about statistical genetics, an interdisciplinary topic between statistical physics and population biology. Our focus is on the Quasi-Linkage Equilibrium (QLE) phase, which has many similarities to equilibrium statistical mechanics, and on how the stability of that phase is lost. The QLE phenomenon was discovered by Motoo Kimura and was extended and generalized to the global genome scale by Neher & Shraiman (2011). What we will refer to as the Kimura-Neher-Shraiman (KNS) theory describes a population evolving under mutation, recombination, genetic drift, and natural selection with pairwise epistatic fitness. The main conclusion of KNS is that the QLE phase exists when the recombination rate ($r$) is sufficiently high relative to the variability in selection strength (fitness). Combining these results with the techniques of Direct Coupling Analysis (DCA), we show that in QLE the epistatic fitness can be inferred from knowledge of the (dynamical) distribution of genotypes in a population. Building on our earlier work, Zeng & Aurell (2020), we extend the analysis to high mutation and recombination rates. We further consider the evolution of a population at higher selection strength relative to the recombination and mutation parameters ($r$ and $\mu$). We identify a new bi-stable phase, which we call the Non-Random Coexistence (NRC) phase, where genomic mutations persist in the population without either fixating or disappearing. We also identify an intermediate region of parameter space where a finite population jumps stochastically between a QLE-like state and NRC-like behaviour. The existence of the NRC phase demonstrates that even if statistical genetics at high recombination closely mirrors equilibrium statistical physics, a more apt analogy is non-equilibrium statistical physics with broken detailed balance, where self-sustained dynamical phenomena are ubiquitous.
Marine species reproduce and compete while being advected by turbulent flows. It is largely unknown, both theoretically and experimentally, how population dynamics and genetics are changed by the presence of fluid flows. Discrete agent-based simulations in continuous space allow for accurate treatment of advection and number fluctuations, but can be computationally expensive for even modest organism densities. In this report, we propose an algorithm to overcome some of these challenges. We first provide a thorough validation of the algorithm in one and two dimensions without flow. Next, we focus on the case of weakly compressible flows in two dimensions. This models organisms such as phytoplankton living at a specific depth in the three-dimensional, incompressible ocean experiencing upwelling and/or downwelling events. We show that organisms born at sources in a two-dimensional time-independent flow experience an increase in fixation probability.
Many non-coding RNAs are known to play a role in the cell that is directly linked to their structure. Structure prediction from sequence alone is, however, a challenging task. On the other hand, thanks to the low cost of sequencing technologies, a very large number of homologous sequences are becoming available for many RNA families. In the protein community, the idea has emerged over the last decade of exploiting the covariance of mutations within a family to predict protein structure using the direct-coupling-analysis (DCA) method. The application of DCA to RNA systems has been limited so far. Here we assess the DCA method on 17 riboswitch families, comparing it with the commonly used mutual information analysis and with the state-of-the-art R-scape covariance method. We also compare different flavors of DCA, including mean-field, pseudo-likelihood, and a proposed stochastic procedure (Boltzmann learning) for solving the DCA inverse problem exactly. Boltzmann learning outperforms the other methods in predicting contacts observed in high-resolution crystal structures.
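As a rough illustration of the mean-field flavor of DCA mentioned above, the following minimal sketch follows the textbook recipe (pseudocounted frequencies, inversion of the connected correlation matrix, Frobenius-norm scoring). It is not the implementation assessed in the paper: the alphabet `ACGU-`, the pseudocount value, the function names and the toy alignment are all assumptions made for this example.

```python
import numpy as np

ALPHABET = "ACGU-"  # assumed RNA alphabet with a gap symbol (last state)


def mean_field_dca_scores(seqs, pseudocount=0.5):
    """Return an (L, L) matrix of mean-field DCA coupling scores.

    Hypothetical helper: textbook mean-field DCA, not the paper's code.
    """
    q = len(ALPHABET)
    idx = {c: k for k, c in enumerate(ALPHABET)}
    X = np.array([[idx[c] for c in s] for s in seqs])  # (M, L) integer codes
    M, L = X.shape
    onehot = np.eye(q)[X]  # (M, L, q)

    # Single-site and pair frequencies, regularized with a pseudocount.
    fi = (1 - pseudocount) * onehot.mean(axis=0) + pseudocount / q
    fij = (1 - pseudocount) * np.einsum("mia,mjb->iajb", onehot, onehot) / M
    fij += pseudocount / q**2
    for i in range(L):
        # Same-site blocks must be diagonal: f_ii(a, b) = delta_ab * f_i(a).
        fij[i, :, i, :] = np.diag(fi[i])

    # Connected correlations; drop the last state to fix the gauge.
    C = fij - fi[:, :, None, None] * fi[None, None, :, :]
    C = C[:, : q - 1, :, : q - 1].reshape(L * (q - 1), L * (q - 1))

    # Mean-field approximation: couplings are minus the inverse correlations.
    J = -np.linalg.inv(C).reshape(L, q - 1, L, q - 1)

    # Score each site pair by the Frobenius norm of its coupling block.
    scores = np.sqrt((J**2).sum(axis=(1, 3)))
    np.fill_diagonal(scores, 0.0)
    return scores


def toy_alignment():
    """Toy alignment: columns 0 and 3 form a base pair, columns 1 and 2 vary freely."""
    pairs = [("A", "U"), ("G", "C"), ("C", "G"), ("U", "A")]
    return [p + b + c + s for p, s in pairs for b in "ACGU" for c in "ACGU"]
```

On the toy alignment, the highest-scoring pair is the covarying pair of columns 0 and 3. Real alignments would additionally need sequence reweighting and a score correction, which this sketch omits.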
Spatial constraints such as rigid barriers affect the dynamics of cell populations, potentially altering the course of natural evolution. In this paper, we study the population genetics of Escherichia coli proliferating in microchannels with open ends. Our experiments reveal that competition between two fluorescently labeled E. coli strains growing in a microchannel generates a stripe pattern aligned with the axial direction of the channel. To account for this observation, we study a lattice population model in which reproducing cells push entire lanes of cells towards the open ends of the channel. By combining mathematical theory, numerical simulations, and experiments, we find that the fixation dynamics is extremely fast along the axial direction, with a logarithmic dependence on the number of cells per lane. In contrast, competition among lanes is a much slower process. We also demonstrate that random mutations that appear in the middle and at the boundaries of the channel are highly likely to reach fixation. By theoretically studying competition between strains of different fitness, we find that the population structure in such a spatially confined system strongly suppresses selection.
The key findings of classical population genetics are derived within an information-theoretic framework based on the entropies of the allele frequency distribution. The common results for drift, mutation, selection, and gene flow will be rewritten in terms of information-theoretic measures and used to draw the classic conclusions for balance conditions and common features of one-locus dynamics. Linkage disequilibrium will also be discussed, including the relationship between mutual information and $r^2$ and a simple model of hitchhiking.
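To make the link between mutual information and $r^2$ concrete, the sketch below computes both for two biallelic loci from the four haplotype frequencies. The function name and argument names are assumptions for this example. The small-disequilibrium approximation $I \approx r^2/2$ (in nats) is a standard second-order expansion, and it may or may not be the form of the relationship derived in the paper.

```python
import math


def ld_stats(p_AB, p_Ab, p_aB, p_ab):
    """Return (D, r2, mutual information in nats) for two biallelic loci.

    Hypothetical helper; arguments are the four haplotype frequencies.
    """
    total = p_AB + p_Ab + p_aB + p_ab
    if abs(total - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies must sum to 1")

    # Marginal allele frequencies at each locus.
    p_A = p_AB + p_Ab
    p_B = p_AB + p_aB

    # Linkage disequilibrium coefficient and its squared correlation.
    D = p_AB - p_A * p_B
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    r2 = D * D / denom

    # Mutual information: sum of p_xy * log(p_xy / (p_x * p_y)).
    cells = [
        (p_A, p_B, p_AB),
        (p_A, 1 - p_B, p_Ab),
        (1 - p_A, p_B, p_aB),
        (1 - p_A, 1 - p_B, p_ab),
    ]
    mi = sum(pj * math.log(pj / (px * py)) for px, py, pj in cells if pj > 0)
    return D, r2, mi
```

For example, `ld_stats(0.26, 0.24, 0.24, 0.26)` gives a mutual information very close to half of $r^2$, while complete association, `ld_stats(0.5, 0.0, 0.0, 0.5)`, gives $r^2 = 1$ and a mutual information of $\ln 2$, where the approximation no longer holds.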
Motivated by Waddington's famous epigenetic landscape metaphor in developmental biology, biophysicists and applied mathematicians have made different proposals to realize this metaphor in a rationalized way. We adopt a comprehensive perspective to systematically investigate three different but closely related realizations in the recent literature: the potential landscape theory from the steady-state distribution of stochastic differential equations (SDEs), the quasi-potential from large deviation theory, and the construction through SDE decomposition and the A-type integral. The connections among these theories are established in this paper. We demonstrate that the quasi-potential is the zero-noise limit of the potential landscape. We also show that the potential function in the third proposal coincides with the quasi-potential. The most probable transition path, obtained by minimizing the Onsager-Machlup or Freidlin-Wentzell action functional, is discussed as well. Furthermore, we compare the local and global quasi-potentials by exchanging the order of the limits in time and noise amplitude. As a consequence of these explorations, we arrive at an existence result for the SDE decomposition while denying its uniqueness in general cases. We also clarify that the A-type integral is more appropriately applied to the decomposed SDEs than to the original one. Our results contribute to a better understanding of existing landscape theories for biological systems.
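The zero-noise-limit statement can be checked by hand in the simplest setting, a one-dimensional gradient system with a confining potential. The noise scaling $\sqrt{2\varepsilon}$ and the symbols $U$, $\phi_\varepsilon$, $V$ and $Z_\varepsilon$ are assumptions chosen for this sketch and need not match the paper's notation.

```latex
% Assumed setting: 1D gradient SDE with confining potential U
% and noise amplitude \sqrt{2\varepsilon}.
\begin{align}
  dX_t &= -U'(X_t)\,dt + \sqrt{2\varepsilon}\,dW_t, \\
  P_{ss}^{\varepsilon}(x) &= \frac{1}{Z_\varepsilon}
      \exp\!\left(-\frac{U(x)}{\varepsilon}\right), \\
  \phi_\varepsilon(x) &:= -\ln P_{ss}^{\varepsilon}(x)
      = \frac{U(x)}{\varepsilon} + \ln Z_\varepsilon, \\
  V(x) &:= \lim_{\varepsilon \to 0} \varepsilon\,\phi_\varepsilon(x)
      = U(x) - \min_{y} U(y).
\end{align}
```

Here the last equality uses Laplace's method, $\varepsilon \ln Z_\varepsilon \to -\min_y U(y)$. In this gradient case the rescaled potential landscape converges to the potential itself, shifted so that its global minimum is zero. For non-gradient drifts no such closed form is available in general.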