19 dubious ways to compute the marginal likelihood of a phylogenetic tree topology

Added by Vladimir Minin

Publication date 2018

Fields Biology, Mathematical Statistics

Research language English

Authors Mathieu Fourment - Andrew F. Magee - Chris Whidden

Populations and Evolution Computation

The marginal likelihood of a model is a key quantity for assessing the evidence provided by the data in support of a model. The marginal likelihood is the normalizing constant for the posterior density, obtained by integrating the product of the likelihood and the prior with respect to model parameters. Thus, the computational burden of computing the marginal likelihood scales with the dimension of the parameter space. In phylogenetics, where we work with tree topologies that are high-dimensional models, standard approaches to computing marginal likelihoods are very slow. Here we study methods to quickly compute the marginal likelihood of a single fixed tree topology. We benchmark the speed and accuracy of 19 different methods to compute the marginal likelihood of phylogenetic topologies on a suite of real datasets. These methods include several new ones that we develop explicitly to solve this problem, as well as existing algorithms that we apply to phylogenetic models for the first time. Altogether, our results show that the accuracy of these methods varies widely, and that accuracy does not necessarily correlate with computational burden. Our newly developed methods are orders of magnitude faster than standard approaches, and in some cases, their accuracy rivals the best established estimators.
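The definition in the abstract (integrate the product of likelihood and prior over the model parameters) can be illustrated on a toy problem. The sketch below is not a phylogenetic model and is not claimed to be one of the methods benchmarked in the paper: it assumes a Beta-Binomial model, chosen only because its marginal likelihood has a closed form, and compares that exact value with a naive Monte Carlo average of the likelihood over prior draws. All function names and parameter values are hypothetical.

```python
import math
import random


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def log_binom_coeff(n, k):
    """Log of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def exact_log_marginal(k, n, a=1.0, b=1.0):
    """Closed-form log marginal likelihood for k successes in n trials
    under a Beta(a, b) prior on the success probability."""
    return log_binom_coeff(n, k) + log_beta(k + a, n - k + b) - log_beta(a, b)


def naive_mc_log_marginal(k, n, a=1.0, b=1.0, draws=20000, seed=0):
    """Estimate the same quantity by averaging the likelihood over
    independent draws from the prior (naive Monte Carlo)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        theta = rng.betavariate(a, b)            # draw from the prior
        total += theta ** k * (1.0 - theta) ** (n - k)
    return log_binom_coeff(n, k) + math.log(total / draws)


if __name__ == "__main__":
    print("exact   :", exact_log_marginal(7, 10))
    print("naive MC:", naive_mc_log_marginal(7, 10))
```

In one dimension the prior-sampling average converges quickly. In high-dimensional parameter spaces most prior draws have negligible likelihood, which is one way to see why the abstract ties computational burden to the dimension of the parameter space.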

Systematic Exploration of the High Likelihood Set of Phylogenetic Tree Topologies

Chris Whidden, Brian C. Claywell, Thayer Fisher - 2018

Bayesian Markov chain Monte Carlo explores tree space slowly, in part because it frequently returns to the same tree topology. An alternative strategy would be to explore tree space systematically, and never return to the same topology. In this paper, we present an efficient parallelized method to map out the high likelihood set of phylogenetic tree topologies via systematic search, which we show to be a good approximation of the high posterior set of tree topologies. Here 'likelihood of a topology' refers to the tree likelihood for the corresponding tree with optimized branch lengths. We call this method 'phylogenetic topographer' (PT). The PT strategy is very simple: starting in a number of local topology maxima (obtained by hill-climbing from random starting points), explore outward using local topology rearrangements, only continuing through topologies that are better than some likelihood threshold below the best observed topology. We show that the normalized topology likelihoods are a useful proxy for the Bayesian posterior probability of those topologies. By using a non-blocking hash table keyed on unique representations of tree topologies, we avoid visiting topologies more than once across all concurrent threads exploring tree space. We demonstrate that PT can be used directly to approximate a Bayesian consensus tree topology. When combined with an accurate means of evaluating per-topology marginal likelihoods, PT gives an alternative procedure for obtaining Bayesian posterior distributions on phylogenetic tree topologies.
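The search strategy described in this abstract can be sketched as a thresholded graph exploration. The code below is a single-threaded illustration under several assumptions, not the authors' implementation: a plain dictionary stands in for the non-blocking hash table, the `neighbors` and `log_likelihood` callables are hypothetical placeholders for local topology rearrangements and branch-length-optimized tree likelihoods, and the toy usage replaces tree topologies with integers.

```python
from collections import deque


def explore_high_likelihood_set(starts, neighbors, log_likelihood, threshold):
    """Explore outward from the starting points, continuing only through
    items whose log-likelihood is within `threshold` of the best seen so far.

    `starts`, `neighbors` and `log_likelihood` are hypothetical stand-ins for
    local topology maxima, topology rearrangements, and optimized tree
    likelihoods. Items must be hashable.
    """
    visited = {}                 # item -> log-likelihood; each item scored once
    best = float("-inf")
    queue = deque()

    for start in starts:
        if start not in visited:
            visited[start] = log_likelihood(start)
            best = max(best, visited[start])
            queue.append(start)

    while queue:
        current = queue.popleft()
        if visited[current] < best - threshold:
            continue             # too far below the best: do not expand
        for candidate in neighbors(current):
            if candidate in visited:
                continue         # never score the same item twice
            visited[candidate] = log_likelihood(candidate)
            best = max(best, visited[candidate])
            queue.append(candidate)

    # Report only the items that clear the final threshold.
    return {t: ll for t, ll in visited.items() if ll >= best - threshold}


if __name__ == "__main__":
    # Toy stand-in: integers on a line with a single peak at 5.
    found = explore_high_likelihood_set(
        starts=[5],
        neighbors=lambda i: [i - 1, i + 1],
        log_likelihood=lambda i: -float((i - 5) ** 2),
        threshold=4.5,
    )
    print(sorted(found))
```

Because the best observed value can rise during the search, an item expanded early may later fall below the threshold; the final filter handles that case in this sketch.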

Populations and Evolution Data Structures and Algorithms

The space of tree-based phylogenetic networks

Mareike Fischer, Andrew Francis - 2019

Phylogenetic networks are generalizations of phylogenetic trees that allow the representation of reticulation events such as horizontal gene transfer or hybridization, and can also represent uncertainty in inference. A subclass of these, tree-based phylogenetic networks, has been introduced to capture the extent to which reticulate evolution nevertheless broadly follows tree-like patterns. Several important operations that change a general phylogenetic network have been developed in recent years, and are important for allowing algorithms to move around spaces of networks, a vital ingredient in finding an optimal network given some biological data. A key such operation is the Nearest Neighbor Interchange, or NNI. While it is already known that the space of unrooted phylogenetic networks is connected under NNI, it has been unclear whether this also holds for the subspace of tree-based networks. In this paper we show that the space of unrooted tree-based phylogenetic networks is indeed connected under the NNI operation. We do so by explicitly showing how to get from one such network to another one without losing tree-basedness along the way. Moreover, we introduce some new concepts, for instance "shoat networks", and derive some interesting aspects concerning tree-basedness. Last, we use our results to derive an upper bound on the size of the space of tree-based networks.

Populations and Evolution Combinatorics

Unrooted non-binary tree-based phylogenetic networks

Mareike Fischer, Michelle Galla, Lina Herbst - 2018

Phylogenetic networks are a generalization of phylogenetic trees allowing for the representation of non-treelike evolutionary events such as hybridization. Typically, such networks have been analyzed based on their 'level', i.e. based on the complexity of their 2-edge-connected components. However, recently the question of how 'treelike' a phylogenetic network is has become the center of attention in various studies. This led to the introduction of tree-based networks, i.e. networks that can be constructed from a phylogenetic tree, called the base tree, by adding additional edges. While the concept of tree-basedness was originally introduced for rooted phylogenetic networks, it has recently also been considered for unrooted networks. In the present study, we compare and contrast findings obtained for unrooted binary tree-based networks to unrooted non-binary networks. In particular, while it is known that up to level 4 all unrooted binary networks are tree-based, we show that in the case of non-binary networks, this result only holds up to level 3.

Populations and Evolution Combinatorics

Non-bifurcating phylogenetic tree inference via the adaptive LASSO

Cheng Zhang, Vu Dinh, Frederick A. Matsen IV - 2018

Phylogenetic tree inference using deep DNA sequencing is reshaping our understanding of rapidly evolving systems, such as the within-host battle between viruses and the immune system. Densely sampled phylogenetic trees can contain special features, including sampled ancestors in which we sequence a genotype along with its direct descendants, and polytomies in which multiple descendants arise simultaneously. These features are apparent after identifying zero-length branches in the tree. However, current maximum-likelihood based approaches are not capable of revealing such zero-length branches. In this paper, we find these zero-length branches by introducing adaptive-LASSO-type regularization estimators to phylogenetics, deriving their properties, and showing regularization to be a practically useful approach for phylogenetics.

Populations and Evolution Machine Learning

Quantifying the accuracy of ancestral state prediction in a phylogenetic tree under maximum parsimony

Lina Herbst, Thomas Li, Mike Steel - 2018

In phylogenetic studies, biologists often wish to estimate the ancestral discrete character state at an interior vertex $v$ of an evolutionary tree $T$ from the states that are observed at the leaves of the tree. A simple and fast estimation method, maximum parsimony, takes the ancestral state at $v$ to be any state that minimises the number of state changes in $T$ required to explain its evolution on $T$. In this paper, we investigate the reconstruction accuracy of this estimation method further, under a simple symmetric model of state change, and obtain a number of new results, both for 2-state characters and for $r$-state characters ($r>2$). Our results rely on establishing new identities and inequalities, based on a coupling argument that involves a simpler 'coin toss' approach to ancestral state reconstruction.
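For readers unfamiliar with maximum parsimony, the estimate at the root of a rooted binary tree can be computed with the bottom-up pass of Fitch's algorithm. The sketch below is an illustration only and is not taken from the paper: the nested-tuple tree encoding, the leaf names, and the function name `fitch_root_states` are assumptions made for the example, and it handles the root vertex rather than an arbitrary interior vertex.

```python
def fitch_root_states(tree, leaf_states):
    """Bottom-up pass of Fitch's algorithm on a rooted binary tree.

    `tree` is either a leaf name (str) or a pair (left, right) of subtrees;
    `leaf_states` maps leaf names to observed states. Both encodings are
    hypothetical choices for this sketch.

    Returns (state_set, changes): the states at the root of `tree` that
    minimise the number of changes, and that minimum number of changes.
    """
    if isinstance(tree, str):
        return {leaf_states[tree]}, 0

    left, right = tree
    left_set, left_changes = fitch_root_states(left, leaf_states)
    right_set, right_changes = fitch_root_states(right, leaf_states)

    common = left_set & right_set
    if common:
        # Children agree on at least one state: no extra change needed.
        return common, left_changes + right_changes
    # Children disagree: keep all candidates and count one change.
    return left_set | right_set, left_changes + right_changes + 1


if __name__ == "__main__":
    tree = (("a", "b"), ("c", "d"))
    states = {"a": 0, "b": 0, "c": 1, "d": 0}
    print(fitch_root_states(tree, states))
```

When the returned set contains more than one state, parsimony alone does not pick a unique ancestral state, which corresponds to the "any state that minimises" wording in the abstract.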

Populations and Evolution
