ترغب بنشر مسار تعليمي؟ اضغط هنا

Reconstructing pedigrees: a stochastic perspective

123   0   0.0 ( 0 )
 نشر من قبل Bhalchandra Thatte
 تاريخ النشر 2007
  مجال البحث علم الأحياء
والبحث باللغة English




اسأل ChatGPT حول البحث

A pedigree is a directed graph that describes how individuals are related through ancestry in a sexually-reproducing population. In this paper we explore the question of whether one can reconstruct a pedigree by just observing sequence data for present day individuals. This is motivated by the increasing availability of genomic sequences, but in this paper we take a more theoretical approach and consider what models of sequence evolution might allow pedigree reconstruction (given sufficiently long sequences). Our results complement recent work that showed that pedigree reconstruction may be fundamentally impossible if one uses just the degrees of relatedness between different extant individuals. We find that for certain stochastic processes, pedigrees can be recovered up to isomorphism from sufficiently long sequences.



قيم البحث

اقرأ أيضاً

Pedigrees are directed acyclic graphs that represent ancestral relationships between individuals in a population. Based on a schematic recombination process, we describe two simple Markov models for sequences evolving on pedigrees - Model R (recombin ations without mutations) and Model RM (recombinations with mutations). For these models, we ask an identifiability question: is it possible to construct a pedigree from the joint probability distribution of extant sequences? We present partial identifiability results for general pedigrees: we show that when the crossover probabilities are sufficiently small, certain spanning subgraph sequences can be counted from the joint distribution of extant sequences. We demonstrate how pedigrees that earlier seemed difficult to distinguish are distinguished by counting their spanning subgraph sequences.
We introduce a new algorithm called {sc Rec-Gen} for reconstructing the genealogy or textit{pedigree} of an extant population purely from its genetic data. We justify our approach by giving a mathematical proof of the effectiveness of {sc Rec-Gen} wh en applied to pedigrees from an idealized generative model that replicates some of the features of real-world pedigrees. Our algorithm is iterative and provides an accurate reconstruction of a large fraction of the pedigree while having relatively low emph{sample complexity}, measured in terms of the length of the genetic sequences of the population. We propose our approach as a prototype for further investigation of the pedigree reconstruction problem toward the goal of applications to real-world examples. As such, our results have some conceptual bearing on the increasingly important issue of genomic privacy.
The Roma people, living throughout Europe, are a diverse population linked by the Romani language and culture. Previous linguistic and genetic studies have suggested that the Roma migrated into Europe from South Asia about 1000-1500 years ago. Geneti c inferences about Roma history have mostly focused on the Y chromosome and mitochondrial DNA. To explore what additional information can be learned from genome-wide data, we analyzed data from six Roma groups that we genotyped at hundreds of thousands of single nucleotide polymorphisms (SNPs). We estimate that the Roma harbor about 80% West Eurasian ancestry-deriving from a combination of European and South Asian sources- and that the date of admixture of South Asian and European ancestry was about 850 years ago. We provide evidence for Eastern Europe being a major source of European ancestry, and North-west India being a major source of the South Asian ancestry in the Roma. By computing allele sharing as a measure of linkage disequilibrium, we estimate that the migration of Roma out of the Indian subcontinent was accompanied by a severe founder event, which we hypothesize was followed by a major demographic expansion once the population arrived in Europe.
144 - Anders Eriksson 2008
Understanding the time evolution of fragmented animal populations and their habitats, connected by migration, is a problem of both theoretical and practical interest. This paper presents a method for calculating the time evolution of the habitats pop ulation size distribution from a general stochastic dynamic within each habitat, using a deterministic approximation which becomes exact for an infinite number of habitats. Fragmented populations are usually thought to be characterized by a separation of time scale between, on the one hand, colonization and extinction of habitats and, on the other hand, the local population dynamics within each habitat. The analysis in this paper suggests an alternative view: the effective population dynamic stems from a law of large numbers, where stochastic fluctuations in population size of single habitats are buffered through the dispersal pool so that the global population dynamic remains approximately smooth. For illustration, the deterministic approximation is compared to simulations of a stochastic model with density dependent local recruitment and mortality. The article is concluded with a discussion of the general implications of the results, and possible extensions of the method.
Tumors often contain multiple subpopulations of cancerous cells defined by distinct somatic mutations. We describe a new method, PhyloWGS, that can be applied to WGS data from one or more tumor samples to reconstruct complete genotypes of these subpo pulations based on variant allele frequencies (VAFs) of point mutations and population frequencies of structural variations. We introduce a principled phylogenic correction for VAFs in loci affected by copy number alterations and we show that this correction greatly improves subclonal reconstruction compared to existing methods.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا