Research papers, master's and doctoral theses published by Lina Herbst

On the minimum value of the Colless index and the bifurcating trees that achieve it

104 - Tomas M. Coronado, Mareike Fischer, Lina Herbst 2019

Measures of tree balance play an important role in the analysis of phylogenetic trees. One of the oldest and most popular indices in this regard is the Colless index for rooted bifurcating trees, introduced by Colless (1982). While many of its statistical properties under different probabilistic models for phylogenetic trees have already been established, little is known about its minimum value and the trees that achieve it. In this manuscript, we fill this gap in the literature. To begin with, we derive both recursive and closed expressions for the minimum Colless index of a tree with $n$ leaves. Surprisingly, these expressions show a connection between the minimum Colless index and the so-called Blancmange curve, a fractal curve. We then fully characterize the tree shapes that achieve this minimum value and we introduce both an algorithm to generate them and a recurrence to count them. After focusing on two extremal classes of trees with minimum Colless index (the maximally balanced trees and the greedy from the bottom trees), we conclude by showing that all trees with minimum Colless index also have minimum Sackin index, another popular balance index.
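As a rough illustration of the minimum value discussed in this abstract, the sketch below computes it by brute force from the standard definition of the Colless index (the sum, over all internal nodes, of the absolute difference between the leaf counts of the two child subtrees). It is not the recursive or closed expression derived in the paper; the function name `min_colless` is a hypothetical helper chosen for this example.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def min_colless(n: int) -> int:
    """Minimum Colless index over all rooted bifurcating tree shapes with n leaves.

    Brute-force dynamic programme based only on the definition of the index;
    this is an illustrative sketch, not the formula from the paper.
    """
    if n == 1:
        return 0  # a single leaf has no internal node
    # The root splits the leaves into a and n - a (with a <= n - a) and
    # contributes the imbalance (n - a) - a; both subtrees are then optimised.
    return min(
        min_colless(a) + min_colless(n - a) + (n - 2 * a)
        for a in range(1, n // 2 + 1)
    )


print([min_colless(n) for n in range(1, 9)])  # [0, 0, 1, 0, 2, 2, 2, 0]
```

The value drops to 0 whenever $n$ is a power of two, because a fully symmetric tree then exists.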

Populations and Evolution, Discrete Mathematics, Combinatorics

Extremal properties of the Colless balance index for rooted binary trees

84 - Mareike Fischer, Lina Herbst, Kristina Wicke 2019

Measures of tree balance play an important role in various research areas, for example in phylogenetics. There, they are used, for instance, to test whether an observed phylogenetic tree differs significantly from a tree generated by the Yule model of speciation. One of the most popular indices in this regard is the Colless index, which measures the degree of balance for rooted binary trees. While many statistical properties of the Colless index (e.g. asymptotic results for its mean and variance under different models of speciation) have already been discussed in different contexts, we focus on its extremal properties. While it is relatively straightforward to characterize trees with maximal Colless index, the analysis of the minimal value of the Colless index and the characterization of trees that achieve it are much more involved. In this note, we therefore focus on the minimal value of the Colless index for any given number of leaves. We derive both a recursive formula for this minimal value and an explicit expression, which shows a surprising connection between the Colless index and the so-called Blancmange curve, a fractal curve that is also known as the Takagi curve. Moreover, we characterize two classes of trees that have minimal Colless index, consisting of the set of so-called maximally balanced trees and a class of trees that we call greedy from the bottom trees. Furthermore, we derive an upper bound for the number of trees with minimal Colless index by relating these trees to trees with minimal Sackin index (another well-studied index of tree balance).
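To make the two balance indices named in this abstract concrete, here is a minimal sketch that evaluates both on small trees. It assumes the usual definitions (Colless: sum of leaf-count imbalances over internal nodes; Sackin: sum of leaf depths) and a hypothetical nested-tuple encoding in which a leaf is any non-tuple value; neither the encoding nor the helper name comes from the paper.

```python
def leaves_colless_sackin(tree):
    """Return (leaf count, Colless index, Sackin index) for a nested-tuple tree.

    A leaf is any non-tuple value; an internal node is a pair (left, right).
    """
    if not isinstance(tree, tuple):
        return 1, 0, 0
    left, right = tree
    nl, cl, sl = leaves_colless_sackin(left)
    nr, cr, sr = leaves_colless_sackin(right)
    n = nl + nr
    # Colless: add the leaf-count imbalance at this node.
    # Sackin: every leaf below this node ends up one edge deeper.
    return n, cl + cr + abs(nl - nr), sl + sr + n


caterpillar = ((((1, 2), 3), 4), 5)  # most unbalanced shape on 5 leaves
balanced = (((1, 2), 3), (4, 5))     # subtree sizes differ by at most one

print(leaves_colless_sackin(caterpillar))  # (5, 6, 14)
print(leaves_colless_sackin(balanced))     # (5, 2, 12)
```

On five leaves the caterpillar reaches the maximal Colless value $(n-1)(n-2)/2 = 6$, while the second tree has Colless index 2.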

Combinatorics, Populations and Evolution

Unrooted non-binary tree-based phylogenetic networks

84 - Mareike Fischer, Michelle Galla, Lina Herbst 2018

Phylogenetic networks are a generalization of phylogenetic trees allowing for the representation of non-treelike evolutionary events such as hybridization. Typically, such networks have been analyzed based on their 'level', i.e. based on the complexity of their 2-edge-connected components. However, recently the question of how 'treelike' a phylogenetic network is has become the center of attention in various studies. This led to the introduction of tree-based networks, i.e. networks that can be constructed from a phylogenetic tree, called the base tree, by adding additional edges. While the concept of tree-basedness was originally introduced for rooted phylogenetic networks, it has recently also been considered for unrooted networks. In the present study, we compare and contrast findings obtained for unrooted binary tree-based networks to unrooted non-binary networks. In particular, while it is known that up to level 4 all unrooted binary networks are tree-based, we show that in the case of non-binary networks, this result only holds up to level 3.

Populations and Evolution, Combinatorics

Classes of treebased networks

54 - Mareike Fischer, Michelle Galla, Lina Herbst 2018

Recently, so-called treebased phylogenetic networks have gained considerable interest in the literature, where a treebased network is a network that can be constructed from a phylogenetic tree, called the base tree, by adding additional edges. The main aim of this manuscript is to provide some sufficient criteria for treebasedness by reducing phylogenetic networks to related graph structures. While it is generally known that deciding whether a network is treebased is NP-complete, one of these criteria, namely edgebasedness, can be verified in linear time. Surprisingly, the class of edgebased networks is closely related to a well-known family of graphs, namely the class of generalized series parallel graphs, and we will explore this relationship in full detail. Additionally, we introduce further classes of treebased networks and analyze their relationships.

Populations and Evolution, Combinatorics

Quantifying the accuracy of ancestral state prediction in a phylogenetic tree under maximum parsimony

122 - Lina Herbst, Thomas Li, Mike Steel 2018

In phylogenetic studies, biologists often wish to estimate the ancestral discrete character state at an interior vertex $v$ of an evolutionary tree $T$ from the states that are observed at the leaves of the tree. A simple and fast estimation method, maximum parsimony, takes the ancestral state at $v$ to be any state that minimises the number of state changes in $T$ required to explain its evolution on $T$. In this paper, we investigate the reconstruction accuracy of this estimation method further, under a simple symmetric model of state change, and obtain a number of new results, both for 2-state characters and for $r$-state characters ($r>2$). Our results rely on establishing new identities and inequalities, based on a coupling argument that involves a simpler 'coin toss' approach to ancestral state reconstruction.

Populations and Evolution

On the accuracy of ancestral sequence reconstruction for ultrametric trees with parsimony

80 - Lina Herbst, Mareike Fischer 2017

We examine a mathematical question concerning the reconstruction accuracy of the Fitch algorithm for reconstructing the ancestral sequence of the most recent common ancestor given a phylogenetic tree and sequence data for all taxa under consideration. In particular, for the symmetric 4-state substitution model, which is also known as the Jukes-Cantor model, we answer affirmatively a conjecture of Li, Steel and Zhang which states that for any ultrametric phylogenetic tree and a symmetric model, the Fitch parsimony method using all terminal taxa is more accurate, or at least as accurate, for ancestral state reconstruction than using any particular terminal taxon or any particular pair of taxa. This conjecture had so far only been answered for two-state data by Fischer and Thatte. Here, we focus on answering the biologically more relevant case with four states, which corresponds to ancestral sequence reconstruction from DNA or RNA data.
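For readers unfamiliar with the Fitch algorithm mentioned in this abstract, the following is a minimal sketch of its bottom-up pass for a single character, which yields the set of most parsimonious states at the root. It assumes a hypothetical nested-tuple tree encoding with leaf states as strings, and it ignores branch lengths, so it does not model the ultrametric setting or the accuracy analysis of the paper.

```python
def fitch_root_set(tree):
    """Bottom-up Fitch pass: return the set of candidate states at the root.

    A leaf is a single state (e.g. "A"); an internal node is a pair of subtrees.
    """
    if not isinstance(tree, tuple):
        return {tree}
    left, right = (fitch_root_set(child) for child in tree)
    # Fitch rule: keep the states both children agree on; if there are none,
    # fall back to the union of the two child sets.
    common = left & right
    return common if common else left | right


unambiguous = ((("A", "A"), "C"), ("A", "G"))
ambiguous = (("A", "C"), ("G", "T"))

print(fitch_root_set(unambiguous))  # {'A'}
print(sorted(fitch_root_set(ambiguous)))  # ['A', 'C', 'G', 'T']
```

When the returned set contains a single state, the parsimony estimate at the root is unambiguous; otherwise every state in the set is equally parsimonious.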

Populations and Evolution, Combinatorics, Probability

Ancestral sequence reconstruction with Maximum Parsimony

100 - Lina Herbst, Mareike Fischer 2017

One of the main aims in phylogenetics is the estimation of ancestral sequences based on present-day data like, for instance, DNA alignments. One way to estimate the data of the last common ancestor of a given set of species is to first reconstruct a phylogenetic tree with some tree inference method and then to use some method of ancestral state inference based on that tree. One of the best-known methods both for tree inference as well as for ancestral sequence inference is Maximum Parsimony (MP). In this manuscript, we focus on this method and on ancestral state inference for fully bifurcating trees. In particular, we investigate a conjecture published by Charleston and Steel in 1995 concerning the number of species which need to have a particular state, say $a$, at a particular site in order for MP to unambiguously return $a$ as an estimate for the state of the last common ancestor. We prove the conjecture for all even numbers of character states, which is the most relevant case in biology. We also show that the conjecture does not hold in general for odd numbers of character states, but also present some positive results for this case.

Populations and Evolution, Combinatorics

The most parsimonious tree for random data

136 - Mareike Fischer, Michelle Galla, Lina Herbst 2014

Applying a method to reconstruct a phylogenetic tree from random data provides a way to detect whether that method has an inherent bias towards certain tree 'shapes'. For maximum parsimony, applied to a sequence of random 2-state data, each possible binary phylogenetic tree has exactly the same distribution for its parsimony score. Despite this pleasing and slightly surprising symmetry, some binary phylogenetic trees are more likely than others to be a most parsimonious (MP) tree for a sequence of $k$ such characters, as we show. For $k=2$, and unrooted binary trees on six taxa, any tree with a caterpillar shape has a higher chance of being an MP tree than any tree with a symmetric shape. On the other hand, if we take any two binary trees, on any number of taxa, we prove that this bias between the two trees vanishes as the number of characters grows. However, again there is a twist: MP trees on six taxa are more likely to have certain shapes than a uniform distribution on binary phylogenetic trees predicts, and this difference does not appear to dissipate as $k$ grows.

Populations and Evolution
