Estimating effective population size changes from preferentially sampled genetic sequences

Added by Vladimir Minin

Publication date 2019

Fields Biology - Mathematical Statistics

Research language English

Authors Michael D. Karcher - Marc A. Suchard - Gytis Dudas

Populations and Evolution Methodology

Coalescent theory combined with statistical modeling allows us to estimate effective population size fluctuations from molecular sequences of individuals sampled from a population of interest. When sequences are sampled serially through time and the distribution of the sampling times depends on the effective population size, explicit statistical modeling of sampling times improves population size estimation. Previous work assumed that the genealogy relating sampled sequences is known and modeled sampling times as an inhomogeneous Poisson process with log-intensity equal to a linear function of the log-transformed effective population size. We improve this approach in two ways. First, we extend the method to allow for joint Bayesian estimation of the genealogy, effective population size trajectory, and other model parameters. Next, we improve the sampling time model by incorporating additional sources of information in the form of time-varying covariates. We validate our new modeling framework using a simulation study and apply our new methodology to analyses of population dynamics of seasonal influenza and to the recent Ebola virus outbreak in West Africa.
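The sampling-time model described in the abstract can be illustrated with a small simulation. The sketch below assumes the log-intensity has the form log lambda(t) = beta0 + beta1 * log Ne(t) and draws sampling times by thinning; the function names, the parameters beta0 and beta1, and the seasonal trajectory example_ne are hypothetical and are not taken from the paper.

```python
import math
import random


def sampling_intensity(t, ne, beta0, beta1):
    """Intensity with log lambda(t) = beta0 + beta1 * log Ne(t)."""
    return math.exp(beta0 + beta1 * math.log(ne(t)))


def simulate_sampling_times(ne, beta0, beta1, t_max, lam_max, rng):
    """Draw event times on (0, t_max] by thinning a homogeneous process.

    lam_max must be an upper bound on the intensity over the interval.
    """
    times = []
    t = 0.0
    while True:
        # Candidate events arrive at the constant rate lam_max.
        t += rng.expovariate(lam_max)
        if t > t_max:
            break
        lam = sampling_intensity(t, ne, beta0, beta1)
        if lam > lam_max:
            raise ValueError("lam_max does not bound the intensity")
        # Keep each candidate with probability lambda(t) / lam_max.
        if rng.random() < lam / lam_max:
            times.append(t)
    return times


def example_ne(t):
    """Hypothetical seasonal effective population size trajectory."""
    return 100.0 * (1.5 + math.sin(2.0 * math.pi * t))


if __name__ == "__main__":
    rng = random.Random(1)
    # The trajectory peaks at t = 0.25; pad the bound by 5 percent.
    bound = 1.05 * sampling_intensity(0.25, example_ne, -2.0, 1.0)
    draws = simulate_sampling_times(example_ne, -2.0, 1.0, 5.0, bound, rng)
    print(len(draws), "sampling times drawn")
```

Under this assumed form, beta1 = 0 gives sampling at a constant rate exp(beta0), while beta1 > 0 concentrates sampling times in periods of large effective population size, which is the preferential sampling the abstract refers to.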

Effective and scalable clustering of SARS-CoV-2 sequences

Sarwan Ali, Tamkanat-E-Ali, Muhammad Asad Khan, 2021

SARS-CoV-2, like any other virus, continues to mutate as it spreads, according to an evolutionary process. Unlike any other virus, the number of currently available sequences of SARS-CoV-2 in public databases such as GISAID is already several million. This amount of data has the potential to uncover the evolutionary dynamics of a virus like never before. However, a million is already several orders of magnitude beyond what can be processed by the traditional methods designed to reconstruct a virus's evolutionary history, such as those that build a phylogenetic tree. Hence, new and scalable methods will need to be devised in order to make use of the ever-increasing number of viral sequences being collected. Since identifying variants is an important part of understanding the evolution of a virus, in this paper, we propose an approach based on clustering sequences to identify the current major SARS-CoV-2 variants. Using $k$-mer-based feature vector generation and efficient feature selection methods, our approach is effective in identifying variants, as well as being efficient and scalable to millions of sequences. Such a clustering method allows us to show the relative proportion of each variant over time, giving the rate of spread of each variant in different locations, something which is important for vaccine development and distribution. We also compute the importance of each amino acid position of the spike protein in identifying a given variant in terms of information gain. Positions of high variant-specific importance tend to agree with those reported by the US Centers for Disease Control and Prevention (CDC), further supporting our approach.
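Two ingredients named in this abstract, $k$-mer feature vectors and per-position information gain, can be sketched in a few lines. This is a generic illustration under assumptions, not the authors' pipeline: the function names, the fixed lexicographic ordering of $k$-mers, and the toy inputs in the comments are all hypothetical.

```python
from collections import Counter
from itertools import product
import math


def kmer_vector(seq, k, alphabet):
    """Count every length-k substring of seq, in a fixed k-mer order."""
    index = {"".join(p): i for i, p in enumerate(product(alphabet, repeat=k))}
    vec = [0] * len(index)
    for start in range(len(seq) - k + 1):
        slot = index.get(seq[start:start + k])
        if slot is not None:  # skip k-mers with symbols outside the alphabet
            vec[slot] += 1
    return vec


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())


def information_gain(column, labels):
    """H(labels) minus H(labels | residue observed at one position).

    column holds the residue at that position for each sequence;
    labels holds the variant assigned to each sequence.
    """
    n = len(labels)
    groups = {}
    for residue, label in zip(column, labels):
        groups.setdefault(residue, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional
```

The fixed-length vectors from kmer_vector can be passed to any standard clustering routine, and a position whose residue fully determines the variant label reaches the maximum information gain, H(labels).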

Populations and Evolution Machine Learning

Detecting range expansions from genetic data

Benjamin M Peter, Montgomery Slatkin, 2013

We propose a method that uses genetic data to test for the occurrence of a recent range expansion and to infer the location of the origin of the expansion. We introduce a statistic for pairs of populations, $\psi$ (the directionality index), that detects asymmetries in the two-dimensional allele frequency spectrum caused by the series of founder events that happen during an expansion. Such asymmetry arises because low-frequency alleles tend to be lost during founder events, thus creating clines in the frequencies of surviving low-frequency alleles. Using simulations, we further show that $\psi$ is more powerful for detecting range expansions than both $F_{ST}$ and clines in heterozygosity. We illustrate the utility of $\psi$ by applying it to a data set from modern humans and show how we can include more complicated scenarios such as multiple expansion origins or barriers to migration in the model.
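The abstract does not give the formula for the directionality index, so the following is only a simplified stand-in for the kind of asymmetry it describes. It assumes the statistic can be approximated by the mean difference in derived-allele frequency between two samples, taken over sites where the derived allele is present in both; the function name and this definition are assumptions, not the paper's estimator.

```python
def directionality_index(freqs_1, freqs_2):
    """Mean derived-allele frequency difference over shared sites.

    freqs_1 and freqs_2 are per-site derived-allele frequencies in two
    population samples. Sites where the derived allele is absent from
    either sample are ignored (an assumption of this sketch).
    """
    shared = [(f1, f2) for f1, f2 in zip(freqs_1, freqs_2) if f1 > 0 and f2 > 0]
    if not shared:
        raise ValueError("no site carries the derived allele in both samples")
    return sum(f1 - f2 for f1, f2 in shared) / len(shared)
```

The value is antisymmetric in its arguments and is zero when the shared derived alleles sit at the same frequencies in both samples, so its sign indicates which sample carries them at higher frequency.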

Populations and Evolution

Complexity in animal communication: Estimating the size of N-Gram structures

Reginald D. Smith, 2013

In this paper, new techniques that allow conditional entropy to estimate the combinatorics of symbols are applied to animal communication studies to estimate the communications repertoire size. By using the conditional entropy estimates at multiple orders, the paper estimates the total repertoire sizes for animal communication across bottlenose dolphins, humpback whales, and several species of birds for N-grams of length one to three. In addition to discussing the impact of this method on studies of animal communication complexity, the reliability of these estimates is compared to other methods through simulation. While entropy does undercount the total repertoire size due to rare N-grams, it gives a more accurate picture of the most frequently used repertoire than just repertoire size alone.
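The entropy quantities mentioned in this abstract can be computed from a symbol sequence as follows. The sketch uses plug-in (empirical frequency) estimates and assumes that an "effective" repertoire size can be read off as 2 raised to the N-gram entropy in bits; that mapping and the function names are assumptions for illustration, not necessarily the estimator used in the paper.

```python
from collections import Counter
import math


def block_entropy(symbols, n):
    """Plug-in Shannon entropy, in bits, of the N-grams of length n."""
    if n < 1 or len(symbols) < n:
        raise ValueError("sequence is shorter than the N-gram length")
    grams = [tuple(symbols[i:i + n]) for i in range(len(symbols) - n + 1)]
    total = len(grams)
    return -sum(c / total * math.log2(c / total) for c in Counter(grams).values())


def conditional_entropy(symbols, n):
    """Entropy of the n-th symbol given the previous n - 1 symbols.

    Computed as H_n - H_(n-1); order 1 is the plain symbol entropy.
    """
    if n == 1:
        return block_entropy(symbols, 1)
    return block_entropy(symbols, n) - block_entropy(symbols, n - 1)


def effective_repertoire(symbols, n):
    """Effective number of N-grams, assumed here to be 2 ** H_n."""
    return 2.0 ** block_entropy(symbols, n)
```

Because 2 ** H_n never exceeds the number of distinct N-grams observed, rare N-grams contribute little to this quantity, which matches the abstract's remark that entropy undercounts the total repertoire while reflecting the most frequently used part of it.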

Populations and Evolution Information Theory

A dynamic modeling tool for estimating healthcare demand from the COVID19 epidemic and evaluating population-wide interventions

Gabriel Rainisch, 2020

Populations and Evolution

The opportunities and challenges of integrating population histories into genetic studies of diverse populations: a motivating example from Native Hawaiians

Charleston W.K. Chiang, 2020

There is an urgent and well-recognized need to extend genetic studies to diverse populations, but several obstacles continue to be prohibitive, including (but not limited to) the difficulty of recruiting individuals from diverse populations in large numbers and the lack of representation in available genomic references. These obstacles notwithstanding, studying multiple diverse populations would provide informative, population-specific insights. Using Native Hawaiians as an example of an understudied population with a unique evolutionary history, I will argue that by developing key genomic resources and integrating evolutionary thinking into genetic epidemiology, we will have the opportunity to efficiently advance our knowledge of the genetic risk factors, ameliorate health disparity, and improve healthcare in this underserved population.

Populations and Evolution
