
On the Identifiability of Phylogenetic Networks under a Pseudolikelihood model

Published by Claudia Solis-Lemus
Publication date: 2020
Research field: Biology
Language: English





The Tree of Life is the graphical structure that represents the evolutionary process from single-cell organisms at the origin of life to the vast biodiversity we see today. Reconstructing this tree from genomic sequences is challenging because a variety of biological forces shape the signal in the data, and many of these processes, such as incomplete lineage sorting and hybridization, can produce confounding information. Here, we present the mathematical version of the identifiability proofs of phylogenetic networks under the pseudolikelihood model in SNaQ. We establish that the ability to detect different hybridization events depends on the number of nodes on the hybridization blob, with small blobs (corresponding to closely related species) being the hardest to detect. Our work focuses on level-1 networks, but draws attention to the importance of identifiability studies on phylogenetic inference methods for broader classes of networks.




Read also

Maximum likelihood estimators are used extensively to estimate unknown parameters of stochastic trait evolution models on phylogenetic trees. Although the MLE has been proven to converge to the true value in the independent-sample case, we cannot appeal to this result because trait values of different species are correlated due to shared evolutionary history. In this paper, we consider a $2$-state symmetric model for a single binary trait and investigate the theoretical properties of the MLE for the transition rate in the large-tree limit. Here, the large-tree limit is a theoretical scenario where the number of taxa increases to infinity and we can observe the trait values for all species. Specifically, we prove that the MLE converges to the true value under some regularity conditions. These conditions ensure that the tree shape is not too irregular, and hold for many practical scenarios such as trees with bounded edges, trees generated from the Yule (pure birth) process, and trees generated from the coalescent point process. Our result also provides an upper bound for the distance between the MLE and the true value.
Phylogenetic Diversity (PD) is a prominent quantitative measure of the biodiversity of a collection of present-day species (taxa). This measure is based on the evolutionary distance among the species in the collection. Loosely speaking, if $\mathcal{T}$ is a rooted phylogenetic tree whose leaf set $X$ represents a set of species and whose edges have real-valued lengths (weights), then the PD score of a subset $S$ of $X$ is the sum of the weights of the edges of the minimal subtree of $\mathcal{T}$ connecting the species in $S$. In this paper, we define several natural variants of the PD score for a subset of taxa which are related by a known rooted phylogenetic network. Under these variants, we explore, for a positive integer $k$, the computational complexity of determining the maximum PD score over all subsets of taxa of size $k$ when the input is restricted to different classes of rooted phylogenetic networks.
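The tree version of the PD score described above can be illustrated with a minimal sketch. It covers only the rooted-tree definition, not the network variants that the paper introduces. The function names, the parent-pointer representation, the `include_root` switch, and the toy tree are all assumptions made for illustration and do not come from the paper.

```python
from itertools import combinations


def pd_score(parent, length, subset, include_root=False):
    """PD score of `subset` on a rooted tree (illustrative sketch).

    parent: dict mapping each non-root node to its parent.
    length: dict mapping each non-root node to the length of the edge
            joining it to its parent.
    include_root: if False, sum the edges of the minimal subtree that
            connects the chosen leaves; if True, also count the edges
            leading from that subtree up to the root (an assumed variant).
    """
    subset = set(subset)
    below = {}  # below[n] = number of chosen leaves at or under node n
    for leaf in subset:
        node = leaf
        while node in parent:  # walk up until the root is reached
            below[node] = below.get(node, 0) + 1
            node = parent[node]
    k = len(subset)
    # An edge lies in the minimal connecting subtree exactly when some,
    # but not all, chosen leaves sit below it.
    return sum(length[n] for n, c in below.items() if include_root or c < k)


def max_pd_subset(parent, length, leaves, k, include_root=False):
    """Brute-force search for a size-k leaf subset with maximum PD."""
    return max(
        combinations(sorted(leaves), k),
        key=lambda s: pd_score(parent, length, s, include_root),
    )


# Hypothetical toy tree: root r with children a and b; a has leaves x, y; b has leaf z.
parent = {"a": "r", "b": "r", "x": "a", "y": "a", "z": "b"}
length = {"a": 1, "b": 2, "x": 1, "y": 3, "z": 4}

print(pd_score(parent, length, {"x", "y"}))        # 4 (edges x and y)
print(pd_score(parent, length, {"x", "y"}, True))  # 5 (adds edge a up to the root)
print(max_pd_subset(parent, length, ["x", "y", "z"], 2))  # ('y', 'z')
```

The brute-force search enumerates every subset of size $k$, so it is only practical for very small examples; it is meant to make the optimization problem concrete, not to reflect the complexity results of the paper.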
Rooted phylogenetic networks provide a way to describe species relationships when evolution departs from the simple model of a tree. However, networks inferred from genomic data can be highly tangled, making it difficult to discern the main reticulation signals present. In this paper, we describe a natural way to transform any rooted phylogenetic network into a simpler canonical network, which has desirable mathematical and computational properties, and is based only on the visible nodes in the original network. The method has been implemented and we demonstrate its application to some examples.
Phylogenetic networks are generalizations of phylogenetic trees that allow the representation of reticulation events such as horizontal gene transfer or hybridization, and can also represent uncertainty in inference. A subclass of these, tree-based phylogenetic networks, have been introduced to capture the extent to which reticulate evolution nevertheless broadly follows tree-like patterns. Several important operations that change a general phylogenetic network have been developed in recent years, and are important for allowing algorithms to move around spaces of networks; a vital ingredient in finding an optimal network given some biological data. A key such operation is the Nearest Neighbor Interchange, or NNI. While it is already known that the space of unrooted phylogenetic networks is connected under NNI, it has been unclear whether this also holds for the subspace of tree-based networks. In this paper we show that the space of unrooted tree-based phylogenetic networks is indeed connected under the NNI operation. We do so by explicitly showing how to get from one such network to another one without losing tree-basedness along the way. Moreover, we introduce some new concepts, for instance "shoat networks", and derive some interesting aspects concerning tree-basedness. Last, we use our results to derive an upper bound on the size of the space of tree-based networks.
In phylogenetic studies, biologists often wish to estimate the ancestral discrete character state at an interior vertex $v$ of an evolutionary tree $T$ from the states that are observed at the leaves of the tree. A simple and fast estimation method, maximum parsimony, takes the ancestral state at $v$ to be any state that minimises the number of state changes in $T$ required to explain its evolution on $T$. In this paper, we investigate the reconstruction accuracy of this estimation method further, under a simple symmetric model of state change, and obtain a number of new results, both for 2-state characters and for $r$-state characters ($r>2$). Our results rely on establishing new identities and inequalities, based on a coupling argument that involves a simpler "coin toss" approach to ancestral state reconstruction.
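The maximum parsimony rule described above can be sketched for the special case where the vertex of interest is the root of the tree. The sketch uses a standard unit-cost dynamic programme (every state change costs 1); it does not reproduce the paper's accuracy analysis or its coupling argument. The function name, the dictionary-based tree representation, and the toy data are assumptions made for illustration.

```python
def parsimony_root_states(children, leaf_state, root, states):
    """Return (set of most-parsimonious root states, minimum number of changes).

    children:   dict mapping each interior node to a list of its children.
    leaf_state: dict mapping each leaf to its observed state.
    states:     iterable of all possible character states.
    """
    states = list(states)

    def cost(node):
        # cost(node)[s] = fewest changes in the subtree below `node`
        # given that `node` itself is in state s.
        if node in leaf_state:
            return {s: 0 if s == leaf_state[node] else float("inf") for s in states}
        child_costs = [cost(c) for c in children[node]]
        return {
            s: sum(min(cc[t] + (s != t) for t in states) for cc in child_costs)
            for s in states
        }

    root_cost = cost(root)
    best = min(root_cost.values())
    return {s for s, c in root_cost.items() if c == best}, best


# Hypothetical toy tree: v has children u and c; u has leaves a and b.
children = {"v": ["u", "c"], "u": ["a", "b"]}

# Leaves a, b, c all observed in state 0: the root estimate is unambiguous.
print(parsimony_root_states(children, {"a": 0, "b": 0, "c": 0}, "v", [0, 1]))
# ({0}, 0)

# Leaf c observed in state 1: both root states need exactly one change,
# so maximum parsimony may return either of them.
print(parsimony_root_states(children, {"a": 0, "b": 0, "c": 1}, "v", [0, 1]))
# ({0, 1}, 1)
```

The second call shows why the abstract says "any state": ties are possible, and how they are broken affects reconstruction accuracy.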