
A model-based approach for identifying signatures of balancing selection in genetic data

Published by Michael DeGiorgio
Publication date: 2013
Research field: Biology
Language: English





While much effort has focused on detecting positive and negative directional selection in the human genome, relatively little work has been devoted to balancing selection. This lack of attention is likely due to the paucity of sophisticated methods for identifying sites under balancing selection. Here we develop two composite likelihood ratio tests for detecting balancing selection. Using simulations, we show that these methods outperform competing methods under a variety of assumptions and demographic models. We apply the new methods to whole-genome human data, and find a number of previously-identified loci with strong evidence of balancing selection, including several HLA genes. Additionally, we find evidence for many novel candidates, the strongest of which is FANK1, an imprinted gene that suppresses apoptosis, is expressed during meiosis in males, and displays marginal signs of segregation distortion. We hypothesize that balancing selection acts on this locus to stabilize the segregation distortion and negative fitness effects of the distorter allele. Thus, our methods are able to reproduce many previously-hypothesized signals of balancing selection, as well as discover novel interesting candidates.


Read also

We consider a population constituted by two types of individuals; each of them can produce offspring in two different islands (as a particular case the islands can be interpreted as active or dormant individuals). We model the evolution of the population of each type using a two-type Feller diffusion with immigration, and we study the frequency of one of the types, in each island, when the total population size in each island is forced to be constant at a dense set of times. This leads to the solution of an SDE which we call the asymmetric two-island frequency process. We derive properties of this process and obtain a large population limit when the total size of each island tends to infinity. Additionally, we compute the fluctuations of the process around its deterministic limit. We establish conditions under which the asymmetric two-island frequency process has a moment dual. The dual is a continuous-time two-dimensional Markov chain that can be interpreted in terms of mutation, branching, pairwise branching, coalescence, and a novel mixed selection-migration term. Also, we conduct a stability analysis of the limiting deterministic dynamical system and present some numerical results to study fixation and a new form of balancing selection. When restricting to the seedbank model, we observe that some combinations of the parameters lead to balancing selection. Besides finding yet another way in which genetic reservoirs increase the genetic variability, we find that if a population that sustains a seedbank competes with one that does not, the seed producers will have a selective advantage if they reproduce faster, but will not have a selective disadvantage if they reproduce slower: their worst case scenario is balancing selection.
We propose a method that uses genetic data to test for the occurrence of a recent range expansion and to infer the location of the origin of the expansion. We introduce a statistic for pairs of populations, $\psi$ (the directionality index), that detects asymmetries in the two-dimensional allele frequency spectrum caused by the series of founder events that happen during an expansion. Such asymmetry arises because low frequency alleles tend to be lost during founder events, thus creating clines in the frequencies of surviving low-frequency alleles. Using simulations, we further show that $\psi$ is more powerful for detecting range expansions than both $F_{ST}$ and clines in heterozygosity. We illustrate the utility of $\psi$ by applying it to a data set from modern humans and show how we can include more complicated scenarios such as multiple expansion origins or barriers to migration in the model.
We investigate a continuous time, probability measure-valued dynamical system that describes the process of mutation-selection balance in a context where the population is infinite, there may be infinitely many loci, and there are weak assumptions on selective costs. Our model arises when we incorporate very general recombination mechanisms into a previous model of mutation and selection from Steinsaltz, Evans and Wachter (2005) and take the relative strength of mutation and selection to be sufficiently small. The resulting dynamical system is a flow of measures on the space of loci. Each such measure is the intensity measure of a Poisson random measure on the space of loci: the points of a realization of the random measure record the set of loci at which the genotype of a uniformly chosen individual differs from a reference wild type due to an accumulation of ancestral mutations. Our motivation for working in such a general setting is to provide a basis for understanding mutation-driven changes in age-specific demographic schedules that arise from the complex interaction of many genes, and hence to develop a framework for understanding the evolution of aging. We establish the existence and uniqueness of the dynamical system, provide conditions for the existence and stability of equilibrium states, and prove that our continuous-time dynamical system is the limit of a sequence of discrete-time infinite population mutation-selection-recombination models in the standard asymptotic regime where selection and mutation are weak relative to recombination and both scale at the same infinitesimal rate in the limit.
RNA-Seq technology allows for studying the transcriptional state of the cell at an unprecedented level of detail. Beyond quantification of whole-gene expression, it is now possible to disentangle the abundance of individual alternatively spliced transcript isoforms of a gene. A central question is to understand the regulatory processes that lead to differences in relative abundance variation due to external and genetic factors. Here, we present a mixed model approach that allows for (i) joint analysis and genetic mapping of multiple transcript isoforms and (ii) mapping of isoform-specific effects. Central to our approach is to comprehensively model the causes of variation and correlation between transcript isoforms, including the genomic background and technical quantification uncertainty. As a result, our method allows accurate testing for shared as well as transcript-specific genetic regulation of transcript isoforms and achieves substantially improved calibration of these statistical tests. Experiments on genotype and RNA-Seq data from 126 human HapMap individuals demonstrate that our model can help to obtain a more fine-grained picture of the genetic basis of gene expression variation.
Understanding the dynamics of an outbreak like that of COVID-19 is important in designing effective control measures. This study aims to develop an agent-based model that compares changes in infection progression by manipulating different parameters in a synthetic population. Model input includes population characteristics such as the age, sex, and working status of each individual, and other factors influencing disease dynamics. Depending on the number of epicentres of infection, the location of primary cases, sensitivity, the proportion of asymptomatic cases, and the frequency or duration of lockdown, our simulator tracks every individual and hence infection progression through the community over time. In a closed community of 10000 people, it is seen that without any lockdown, the number of cases peaks around the 6th week and wanes around the 15th week. If the primary case is located inside a dense population cluster such as a slum, cases peak early and wane slowly. With the introduction of lockdown, cases peak at a slower rate. If the sensitivity of identifying infection decreases, cases and deaths increase. The number of cases declines with an increase in the proportion of asymptomatic cases. The model is robust and provides reproducible estimates with realistic parameter values. It also guides the identification of measures to control an outbreak in a community. It is flexible in accommodating different parameters such as infectivity period, yield of testing, socio-economic strata, daily travel, awareness level, population density, social distancing, and lockdown, and can be tailored to study other infections with a similar transmission pattern.