Since the sequencing of large genomes, many statistical features of their sequences have been found. One intriguing feature is that certain subsequences are much more abundant than others. In fact, abundances of subsequences of a given length are distributed with a scale-free power-law tail, resembling properties of human texts such as Zipf's law. Despite recent efforts, this phenomenon is still poorly understood. Here we find that selfish DNA elements, such as those belonging to the Alu family of repeats, dominate the power-law tail. Interestingly, for the Alu elements the power-law exponent increases with the length of the considered subsequences. Motivated by these observations, we develop a model of selfish DNA expansion. The predictions of this model agree qualitatively and quantitatively with the empirical observations. This allows us to estimate parameters for the process of selfish DNA spreading in a genome during its evolution. The obtained results shed light on how the evolution of selfish DNA elements shapes non-trivial statistical properties of genomes.
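As an illustration of the quantity discussed above, the abundance of each subsequence of length k can be tabulated by sliding a window along the sequence. The following is a minimal sketch, not the authors' pipeline: the function names `count_kmers` and `abundance_spectrum` and the toy sequence are hypothetical, and a real analysis would stream a whole genome and fit the tail of the spectrum.

```python
from collections import Counter


def count_kmers(sequence, k):
    """Count every overlapping subsequence (k-mer) of length k."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def abundance_spectrum(kmer_counts):
    """Map an abundance n to the number of distinct k-mers seen exactly n times."""
    return Counter(kmer_counts.values())


# Toy sequence (an assumption for illustration only, not genomic data).
toy = "ACGTACGTACGTTTGACGT"
counts = count_kmers(toy, 4)
spectrum = abundance_spectrum(counts)

# Most abundant k-mers first; in a real genome this ranking is where
# a heavy tail would show up.
for kmer, n in counts.most_common(3):
    print(kmer, n)
```

On a log-log plot of `spectrum`, a power-law tail would appear as an approximately straight line at large abundances.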
Contrary to long-held views, recent evidence indicates that $\textit{de novo}$ birth of genes is not only possible, but is surprisingly prevalent: a substantial fraction of eukaryotic genomes are composed of orphan genes, which show no homology with any conserved genes. And a remarkably large proportion of orphan genes likely originated $\textit{de novo}$ from non-genic regions. Here, using a parsimonious mathematical model, we investigate the probability and timescale of $\textit{de novo}$ gene birth due to spontaneous mutations. We trace how an initially non-genic locus accumulates beneficial mutations to become a gene. We sample across a wide range of biologically feasible distributions of fitness effects (DFE) of mutations, and calculate the conditions conducive to gene birth. We find that in a time frame of millions of years, gene birth is highly likely for a wide range of DFEs. Moreover, when we allow DFEs to fluctuate, which is expected given the long time frame, gene birth in the model becomes practically inevitable. This supports the idea that gene birth is a ubiquitous process, and should occur in a wide variety of organisms. Our results also demonstrate that intergenic regions are not inactive and silent but are more like dynamic storehouses of potential genes.
Understanding how genotypes map onto phenotypes, fitness, and eventually organisms is arguably the next major missing piece in a fully predictive theory of evolution. We refer to this generally as the problem of the genotype-phenotype map. Though we are still far from achieving a complete picture of these relationships, our current understanding of simpler questions, such as the structure induced in the space of genotypes by sequences mapped to molecular structures, has revealed important facts that deeply affect the dynamical description of evolutionary processes. Empirical evidence supporting the fundamental relevance of features such as phenotypic bias is mounting as well, while the synthesis of conceptual and experimental progress leads to questioning current assumptions on the nature of evolutionary dynamics, with cancer progression models and synthetic biology approaches being notable examples. This work takes a critical and constructive look at our current knowledge of how genotypes map onto molecular phenotypes and organismal functions, and discusses theoretical and empirical avenues to broaden and improve this comprehension. As a final goal, this community should aim at deriving an updated picture of evolutionary processes soundly relying on the structural properties of genotype spaces, as revealed by modern techniques of molecular and functional analysis.
Although accumulation of molecular damage is suggested to be an important molecular mechanism of aging, a quantitative link between the dynamics of damage accumulation and mortality of species has so far remained elusive. To address this question, we examine stability properties of a generic gene regulatory network (GRN) and demonstrate that many characteristics of aging and the associated population mortality rate emerge as inherent properties of the critical dynamics of gene regulation and metabolic levels. Based on the analysis of age-dependent changes in gene-expression and metabolic profiles in Drosophila melanogaster, we explicitly show that the underlying GRNs are nearly critical and inherently unstable. This instability manifests itself as aging in the form of distortion of gene expression and metabolic profiles with age, and causes the characteristic increase in mortality rate with age as described by a form of the Gompertz law. In addition, we explain late-life mortality deceleration observed at very late ages for large populations. We show that aging contains a stochastic component, related to accumulation of regulatory errors in transcription/translation/metabolic pathways due to imperfection of signaling cascades in the network and of responses to environmental factors. We also establish that there is a strong deterministic component, suggesting genetic control. Since mortality in humans, where it is characterized best, is strongly associated with the incidence of age-related diseases, our findings support the idea that aging is the driving force behind the development of chronic human diseases.
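For readers unfamiliar with the Gompertz law mentioned above, the following sketch shows its textbook form, in which the mortality rate grows exponentially with age. The abstract only says "a form of the Gompertz law", so the exact parametrisation here is an assumption, and the names `m0` (initial mortality rate) and `gamma` (Gompertz exponent) are illustrative. The sketch does not model the late-life mortality deceleration that the abstract also discusses.

```python
import math


def gompertz_hazard(age, m0, gamma):
    """Mortality rate m(t) = m0 * exp(gamma * t) (textbook Gompertz form)."""
    return m0 * math.exp(gamma * age)


def gompertz_survival(age, m0, gamma):
    """Survival S(t) = exp(-(m0 / gamma) * (exp(gamma * t) - 1)).

    Obtained by integrating the hazard from 0 to t and exponentiating
    the negative of the result.
    """
    return math.exp(-(m0 / gamma) * (math.exp(gamma * age) - 1.0))


def mortality_doubling_time(gamma):
    """Age interval over which the mortality rate doubles: ln(2) / gamma."""
    return math.log(2.0) / gamma


# Illustrative parameter values only (not estimates from the study).
m0, gamma = 0.01, 0.1
for age in (0, 10, 20, 40):
    print(age, gompertz_hazard(age, m0, gamma), gompertz_survival(age, m0, gamma))
```

The doubling time depends only on `gamma`, which is why the exponent is the natural single-number summary of the rate of aging in this parametrisation.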
The incubation period of a disease is the time between an initiating pathologic event and the onset of symptoms. For typhoid fever, polio, measles, leukemia and many other diseases, the incubation period is highly variable. Some affected people take much longer than average to show symptoms, leading to a distribution of incubation periods that is right-skewed and often approximately lognormal. Although this statistical pattern was discovered more than sixty years ago, explaining its ubiquity remains an open question. Here we propose an explanation based on evolutionary dynamics on graphs. For simple models of a mutant or pathogen invading a network-structured population of healthy cells, we show that skewed distributions of incubation periods emerge for a wide range of assumptions about invader fitness, competition dynamics, and network structure. The skewness stems from stochastic mechanisms associated with two classic problems in probability theory: the coupon collector and the random walk. Unlike previous explanations that rely crucially on heterogeneity, our results hold even for homogeneous populations. Thus, we predict that two equally healthy individuals subjected to equal doses of equally pathogenic agents may, by chance alone, show remarkably different time courses of disease.
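The coupon collector problem named above already produces a right-skewed completion time on its own, which a short simulation can show. This is a sketch of the textbook problem only, not of the authors' network invasion model; the function names, the number of coupons (50), the number of trials (2000), and the seed are all assumptions chosen for illustration.

```python
import random
import statistics


def coupon_collector_time(n, rng):
    """Number of uniform random draws needed to see all n coupon types."""
    seen = set()
    draws = 0
    while len(seen) < n:
        seen.add(rng.randrange(n))
        draws += 1
    return draws


def sample_skewness(values):
    """Third standardized moment; positive values indicate a long right tail."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


rng = random.Random(0)  # fixed seed so the run is reproducible
times = [coupon_collector_time(50, rng) for _ in range(2000)]

# The expected completion time is n * H_n, where H_n is the n-th harmonic number.
expected = 50 * sum(1.0 / k for k in range(1, 51))
print(statistics.fmean(times), expected, sample_skewness(times))
```

The long right tail comes from the last few coupons: each remaining type is hit with small probability per draw, so a few unlucky runs take far longer than average even though every run follows identical rules.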
Evolutionary game theory has traditionally assumed that all individuals in a population interact with each other between reproduction events. We show that eliminating this restriction by explicitly considering the time scales of interaction and selection leads to dramatic changes in the outcome of evolution. Examples include the selection of the inefficient strategy in the Harmony and Stag-Hunt games, and the disappearance of the coexistence state in the Snowdrift game. Our results hold for any population size and in the presence of a background of fitness.
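One way to make the separation of time scales concrete is to let only a limited number of pairwise interactions take place before each reproduction event. The sketch below is a hypothetical illustration, not the authors' model: the parameter `s` (interactions per reproduction event), the `baseline` fitness, the Moran-style update, and the payoff values are all assumptions. Payoffs must be non-negative and `baseline` positive so that the sampling weights are valid.

```python
import random


def moran_step(strategies, payoff, s, baseline, rng):
    """Play s random pairwise games, then perform one reproduction event."""
    n = len(strategies)
    earned = [0.0] * n
    for _ in range(s):
        i, j = rng.sample(range(n), 2)  # two distinct players meet
        earned[i] += payoff[strategies[i]][strategies[j]]
        earned[j] += payoff[strategies[j]][strategies[i]]
    # Fitness = background fitness + payoff collected in this round.
    fitness = [baseline + e for e in earned]
    parent = rng.choices(range(n), weights=fitness, k=1)[0]
    replaced = rng.randrange(n)
    strategies[replaced] = strategies[parent]
    return strategies


def run_to_fixation(strategies, payoff, s, baseline, rng, max_steps=100_000):
    """Iterate until one strategy takes over; return it, or None if capped."""
    for _ in range(max_steps):
        if len(set(strategies)) == 1:
            return strategies[0]
        moran_step(strategies, payoff, s, baseline, rng)
    return None


# Illustrative 2x2 payoff matrix: payoff[a][b] is what strategy a earns against b.
payoff = [[1.0, 0.25], [0.75, 0.0]]
rng = random.Random(1)
population = [0] * 5 + [1] * 5
print(run_to_fixation(population, payoff, s=3, baseline=1.0, rng=rng))
```

With small `s` most individuals do not play at all before selection acts, so fitness differences are driven by who happened to interact; with large `s` every individual's payoff approaches its average against the population, which is the traditional assumption.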