Inference with population genetic data usually treats the population pedigree as a nuisance parameter, the unobserved product of a past history of random mating. However, the history of genetic relationships in a given population is a fixed, unobserved object, and so an alternative approach is to treat this network of relationships as a complex object we wish to learn about, by observing how genomes have been noisily passed down through it. This paper explores this point of view, showing how to translate questions about population genetic data into calculations with a Poisson process of mutations on all ancestral genomes. This method is applied to give a robust interpretation of the $f_4$ statistic used to identify admixture, and to design a new statistic that measures covariances in mean times to most recent common ancestor between two pairs of sequences. More generally, the method interprets population genetic statistics as sums of specific functions over ancestral genomes, thereby giving these statistics concrete, broadly understandable interpretations. This provides a way to describe demographic history without simplified demographic models, and it brings into focus the population pedigree, which is averaged over in model-based demographic inference.
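As an illustration of the $f_4$ statistic mentioned above, the following is a minimal sketch that assumes its standard allele-frequency form, the average over sites of $(p_A - p_B)(p_C - p_D)$. The function name `f4` and the list-based inputs are hypothetical choices for this example and are not taken from the paper, whose interpretation is in terms of ancestral genomes rather than this estimator.

```python
def f4(p_a, p_b, p_c, p_d):
    """Average over sites of (p_a - p_b) * (p_c - p_d).

    Each argument is a sequence of per-site allele frequencies in one
    population (hypothetical input format, chosen for this sketch).
    """
    n = len(p_a)
    if n == 0 or not (len(p_b) == len(p_c) == len(p_d) == n):
        raise ValueError("all inputs must be non-empty and of equal length")
    total = 0.0
    for a, b, c, d in zip(p_a, p_b, p_c, p_d):
        # Product of the frequency difference within each pair at this site.
        total += (a - b) * (c - d)
    return total / n
```

A value near zero is what one would expect when the differences within the two pairs are uncorrelated across sites; the sketch does not provide a significance test.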
Motivation: We introduce TRONCO (TRanslational ONCOlogy), an open-source R package that implements state-of-the-art algorithms for the inference of cancer progression models from (epi)genomic mutational profiles. TRONCO can be used to extract population-level models describing the trends of accumulation of alterations in a cohort of cross-sectional samples, e.g., retrieved from publicly available databases, and individual-level models that reveal the clonal evolutionary history in single cancer patients when multiple samples, e.g., multiple biopsies or single-cell sequencing data, are available. The resulting models can provide key hints for uncovering the evolutionary trajectories of cancer, especially for precision medicine or personalized therapy. Availability: TRONCO is released under the GPL license, is hosted in the Software section at http://bimib.disco.unimib.it/, and is also archived at bioconductor.org. Contact:
[email protected]
We consider a population evolving due to mutation, selection and recombination, where selection includes single-locus terms (additive fitness) and two-locus terms (pairwise epistatic fitness). We further consider the problem of inferring fitness in the evolutionary dynamics from one or several snapshots of the distribution of genotypes in the population. In the recent literature this has been done by applying the Quasi-Linkage Equilibrium (QLE) regime first obtained by Kimura in the limit of high recombination. Here we show that the approach also works in the interesting regime where the effects of mutations are comparable to or larger than those of recombination. This leads to a modified main epistatic fitness inference formula in which the rates of mutation and recombination occur together. We also derive this formula using a previously developed Gaussian closure that formally remains valid when recombination is absent. The findings are validated through numerical simulations.
We define the Sampled Moran Genealogy Process, a continuous-time Markov process on the space of genealogies with the demography of the classical Moran process, sampled through time. To do so, we begin by defining the Moran Genealogy Process using a novel representation. We then extend this process to include sampling through time. We derive exact conditional and marginal probability distributions for the sampled process under a stationarity assumption, and an exact expression for the likelihood of any sequence of genealogies it generates. This leads to some interesting observations pertinent to existing phylodynamic methods in the literature. Throughout, our proofs are original, use strictly forward-in-time calculations, and are exact for all population sizes and sampling processes.
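For readers unfamiliar with the demography of the classical Moran process referred to above, the following is a minimal forward-in-time sketch. It is not the Moran Genealogy Process or its sampled version; it only simulates replacement events in a population of constant size. The function name `simulate_moran`, the exponential waiting times with a single `rate` parameter, and the choice that the dying individual differs from the parent are all assumptions made for this illustration.

```python
import random


def simulate_moran(n, n_events, rate=1.0, seed=0):
    """Simulate replacement events in a constant-size population.

    Returns (lineage, events): lineage[i] is the founder label carried by
    slot i at the end, and events is a list of (time, parent, dead) tuples.
    Assumptions of this sketch: exponential waiting times with the given
    rate, and the dying slot is always distinct from the parent slot.
    """
    if n < 2:
        raise ValueError("population size must be at least 2")
    rng = random.Random(seed)
    lineage = list(range(n))  # each slot starts as its own founder
    t = 0.0
    events = []
    for _ in range(n_events):
        t += rng.expovariate(rate)   # waiting time to the next event
        parent = rng.randrange(n)    # slot that reproduces
        dead = rng.randrange(n - 1)  # slot that dies, drawn from the others
        if dead >= parent:
            dead += 1
        lineage[dead] = lineage[parent]  # offspring replaces the dead slot
        events.append((t, parent, dead))
    return lineage, events
```

Recording each `(time, parent, dead)` triple is what would let a genealogy be reconstructed afterwards; the population size stays fixed at `n` because every birth is paired with a death.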
We present a method for estimating epidemic parameters in network-based stochastic epidemic models when the total number of infections is assumed to be small. We illustrate the method by reanalyzing the data from the 2014 Democratic Republic of the Congo (DRC) Ebola outbreak described in Maganga et al. (2014).
In a (two-type) Wright-Fisher diffusion with directional selection and two-way mutation, let $x$ denote today's frequency of the beneficial type, and given $x$, let $h(x)$ be the probability that, among all individuals of today's population, the individual whose progeny will eventually take over in the population is of the beneficial type. Fearnhead [Fearnhead, P., 2002. The common ancestor at a nonneutral locus. J. Appl. Probab. 39, 38-54] and Taylor [Taylor, J. E., 2007. The common ancestor process for a Wright-Fisher diffusion. Electron. J. Probab. 12, 808-847] obtained a series representation for $h(x)$. We develop a construction that contains elements of both the ancestral selection graph and the lookdown construction and includes pruning of certain lines upon mutation. Besides being interesting in its own right, this construction allows a transparent derivation of the series coefficients of $h(x)$ and gives them a probabilistic meaning.