More than any other species, humans form social ties to individuals who are neither kin nor mates, and these ties tend to be with similar people. Here, we show that this similarity extends to genotypes. Across the whole genome, friends' genotypes at the SNP level tend to be positively correlated (homophilic); however, certain genotypes are negatively correlated (heterophilic). A focused gene set analysis suggests that some of the overall correlation can be explained by specific systems; for example, an olfactory gene set is homophilic and an immune system gene set is heterophilic. Finally, homophilic genotypes exhibit significantly higher measures of positive selection, suggesting that, on average, they may yield a synergistic fitness advantage that has been helping to drive recent human evolution.
Suppose we have $n$ different types of self-replicating entity, with the population $P_i$ of the $i$th type changing at a rate equal to $P_i$ times the fitness $f_i$ of that type. Suppose the fitness $f_i$ is any continuous function of all the populations $P_1, \dots, P_n$. Let $p_i$ be the fraction of replicators that are of the $i$th type. Then $p = (p_1, \dots, p_n)$ is a time-dependent probability distribution, and we prove that its speed as measured by the Fisher information metric equals the variance in fitness. In rough terms, this says that the speed at which information is updated through natural selection equals the variance in fitness. This result can be seen as a modified version of Fisher's fundamental theorem of natural selection. We compare it to Fisher's original result as interpreted by Price, Ewens and Edwards.
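As a minimal numerical sketch of the identity stated above, the following Python snippet checks it at a single point in population space. It assumes the convention in which the squared Fisher speed is $\sum_i (dp_i/dt)^2 / p_i$; under that convention it is this squared quantity that matches the variance in fitness. The fitness function `fitness` and the population vector are hypothetical, chosen only for illustration.

```python
import numpy as np

def fitness(P):
    # Hypothetical continuous fitness functions f_i(P_1, ..., P_n); an assumption.
    return np.array([1.0 - 0.1 * P[1], 0.5 + 0.05 * P[0], 0.2 * np.sin(P[2])])

def fisher_speed_squared(P):
    # Populations evolve as dP_i/dt = f_i(P) * P_i.
    dP = fitness(P) * P
    total = P.sum()
    p = P / total
    # Quotient rule for p_i = P_i / sum_j P_j.
    dp = dP / total - P * dP.sum() / total**2
    # Assumed convention: squared speed = sum_i (dp_i/dt)^2 / p_i.
    return np.sum(dp**2 / p)

def fitness_variance(P):
    # Variance of fitness under the distribution p.
    f = fitness(P)
    p = P / P.sum()
    mean_f = np.dot(p, f)
    return np.dot(p, (f - mean_f) ** 2)

P = np.array([3.0, 1.0, 2.0])
print(fisher_speed_squared(P), fitness_variance(P))
```

The two printed numbers should agree up to floating-point rounding for any positive population vector, since $dp_i/dt = (f_i - \bar{f})\,p_i$ follows directly from the quotient rule.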
To identify genetic changes underlying dog domestication and reconstruct their early evolutionary history, we analyzed novel high-quality genome sequences of three gray wolves, one from each of three putative centers of dog domestication, two ancient dog lineages (Basenji and Dingo) and a golden jackal as an outgroup. We find that dogs and wolves diverged through a dynamic process involving population bottlenecks in both lineages and post-divergence gene flow, which confounds previous inferences of dog origins. In dogs, the domestication bottleneck was severe, involving a 17- to 49-fold reduction in population size, a much stronger bottleneck than estimated previously from less intensive sequencing efforts. A sharp bottleneck in wolves occurred soon after their divergence from dogs, implying that the pool of diversity from which dogs arose was far larger than represented by modern wolf populations. Conditional on mutation rate, we narrow the plausible range for the date of initial dog domestication to an interval from 11 to 16 thousand years ago. This period predates the rise of agriculture, implying that the earliest dogs arose alongside hunter-gatherers rather than agriculturists. Regarding the geographic origin of dogs, we find that, surprisingly, none of the extant wolf lineages from putative domestication centers are more closely related to dogs, and the sampled wolves instead form a sister monophyletic clade. This result, in combination with our finding of dog-wolf admixture during the process of domestication, suggests that a re-evaluation of past hypotheses of dog origin is necessary. Finally, we also detect signatures of selection, including evidence for selection on genes implicated in morphology, metabolism, and neural development. Uniquely, we find support for selective sweeps at regulatory sites, suggesting that gene regulatory changes played a critical role in dog domestication.
W. D. Hamilton's celebrated formula for the age-specific force of natural selection furnishes predictions for senescent mortality due to mutation accumulation, at the price of reliance on a linear approximation. Applying to Hamilton's setting the full non-linear demographic model for mutation accumulation of Evans et al. (2007), we find surprising differences. Non-linear interactions cause the collapse of Hamilton-style predictions in the most commonly studied case, refine predictions in other cases, and allow Walls of Death at ages before the end of reproduction. Haldane's Principle for genetic load has an exact but unfamiliar generalization.
Using brain imaging quantitative traits (QTs) to identify genetic risk factors is an important research topic in imaging genetics. Many efforts have been made to build linear models, e.g. linear regression (LR), to extract the association between imaging QTs and genetic factors such as single nucleotide polymorphisms (SNPs). However, to the best of our knowledge, these linear models cannot fully uncover the complicated relationship because of the loci's elusive and diverse impacts on imaging QTs. Although deep learning models can extract the nonlinear relationship, they cannot select relevant genetic factors. In this paper, we propose a novel multi-task deep feature selection (MTDFS) method for brain imaging genetics. MTDFS first adds a multi-task one-to-one layer and imposes a hybrid sparsity-inducing penalty to select relevant SNPs that make significant contributions to abnormal imaging QTs. It then builds a multi-task deep neural network to model the complicated associations between imaging QTs and SNPs. MTDFS not only extracts the nonlinear relationship but also equips the deep neural network with feature selection capability. We compared MTDFS to both LR and single-task DFS (DFS) methods on real neuroimaging genetic data. The experimental results showed that MTDFS performed better than both LR and DFS in terms of QT-SNP relationship identification and feature selection. In short, MTDFS is powerful for identifying risk loci and could be a valuable supplement to the method library for brain imaging genetics.
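The one-to-one layer and sparsity penalty described above could look roughly like the following NumPy sketch. This is not the authors' implementation: the abstract does not specify the form of the hybrid penalty, so the combination of an elementwise L1 term with a per-SNP group (L2,1) term is an assumption, as are the names `one_to_one_layer`, `hybrid_penalty`, `selected_snps` and the weights `lam_l1` and `lam_group`.

```python
import numpy as np

def one_to_one_layer(X, W):
    # X: (samples, snps) genotype matrix; W: (tasks, snps) one weight per SNP per task.
    # Each SNP is scaled by its own weight, with no mixing across SNPs.
    # Returns an array of shape (tasks, samples, snps).
    return X[None, :, :] * W[:, None, :]

def hybrid_penalty(W, lam_l1=0.01, lam_group=0.01):
    # Assumed penalty form: elementwise L1 plus a group term over each SNP's
    # column of task weights, which pushes whole SNPs toward zero across tasks.
    l1_term = np.abs(W).sum()
    group_term = np.linalg.norm(W, axis=0).sum()
    return lam_l1 * l1_term + lam_group * group_term

def selected_snps(W, tol=1e-8):
    # A SNP counts as selected if any task keeps a non-negligible weight on it.
    return np.flatnonzero(np.linalg.norm(W, axis=0) > tol)

# Toy example: 2 imaging QTs (tasks), 3 SNPs, 4 subjects.
W = np.array([[1.0, 0.0, 2.0],
              [0.0, 0.0, -1.0]])
X = np.ones((4, 3))
print(one_to_one_layer(X, W).shape)
print(selected_snps(W))
```

In a full model the output of the one-to-one layer would feed the multi-task network, and the penalty would be added to the training loss so that the weights of uninformative SNPs shrink to zero.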
The advent of accessible ancient DNA technology now allows the direct ascertainment of allele frequencies in ancestral populations, thereby enabling the use of allele frequency time series to detect and estimate natural selection. Such direct observations of allele frequency dynamics are expected to be more powerful than inferences made using patterns of linked neutral variation obtained from modern individuals. We develop a Bayesian method to make use of allele frequency time series data and infer the parameters of general diploid selection, along with allele age, in non-equilibrium populations. We introduce a novel path augmentation approach, in which we use Markov chain Monte Carlo to integrate over the space of allele frequency trajectories consistent with the observed data. Using simulations, we show that this approach has good power to estimate selection coefficients and allele age. Moreover, when applying our approach to data on horse coat color, we find that ignoring a relevant demographic history can significantly bias the results of inference. Our approach is made available in a C++ software package.
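To make the notion of an allele frequency trajectory under diploid selection concrete, here is a minimal forward simulation in Python. It is a discrete Wright-Fisher sketch under stated assumptions, not the Bayesian path augmentation method or the C++ package described above: genotype fitnesses are assumed to be $1$, $1 + hs$ and $1 + s$, the population size is allowed to change between generations, and the names `selection_step` and `simulate_trajectory` are hypothetical.

```python
import numpy as np

def selection_step(x, s, h):
    # Deterministic change in the frequency x of allele A over one generation,
    # with assumed genotype fitnesses aa = 1, Aa = 1 + h*s, AA = 1 + s.
    mean_fitness = x * x * (1 + s) + 2 * x * (1 - x) * (1 + h * s) + (1 - x) ** 2
    return (x * x * (1 + s) + x * (1 - x) * (1 + h * s)) / mean_fitness

def simulate_trajectory(x0, s, h, pop_sizes, seed=0):
    # pop_sizes lists the diploid population size N for each generation, so a
    # non-constant list stands in for a non-equilibrium demographic history.
    rng = np.random.default_rng(seed)
    x = x0
    trajectory = [x0]
    for n_diploid in pop_sizes:
        expected = min(max(selection_step(x, s, h), 0.0), 1.0)
        # Genetic drift: binomial sampling of 2N gene copies.
        x = rng.binomial(2 * n_diploid, expected) / (2 * n_diploid)
        trajectory.append(x)
    return np.array(trajectory)

# A bottleneck followed by expansion, with additive selection (h = 0.5).
traj = simulate_trajectory(0.2, 0.05, 0.5, [100] * 50 + [1000] * 50, seed=1)
print(traj[:5])
```

Sparse, noisy samples taken from trajectories like this one are the kind of time series data from which selection coefficients and allele age would be inferred.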