Measures of tree balance play an important role in various research areas, for example in phylogenetics. There they are used, for instance, to test whether an observed phylogenetic tree differs significantly from a tree generated by the Yule model of speciation. One of the most popular indices in this regard is the Colless index, which measures the degree of balance of rooted binary trees. While many statistical properties of the Colless index (e.g. asymptotic results for its mean and variance under different models of speciation) have already been discussed in different contexts, we focus on its extremal properties. While it is relatively straightforward to characterize trees with maximal Colless index, the analysis of the minimal value of the Colless index and the characterization of the trees that achieve it are much more involved. In this note, we therefore focus on the minimal value of the Colless index for any given number of leaves. We derive both a recursive formula for this minimal value and an explicit expression, which shows a surprising connection between the Colless index and the so-called Blancmange curve, a fractal curve that is also known as the Takagi curve. Moreover, we characterize two classes of trees that have minimal Colless index: the set of so-called \emph{maximally balanced trees} and a class of trees that we call \emph{greedy from the bottom trees}. Furthermore, we derive an upper bound for the number of trees with minimal Colless index by relating these trees to trees with minimal Sackin index (another well-studied index of tree balance).
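As a small illustration of the quantities discussed above, the following sketch computes the Colless index of a rooted binary tree, i.e. the sum over all internal nodes of the absolute difference between the leaf counts of the two child subtrees. The tree encoding (a leaf is `None`, an internal node is a pair of subtrees) and all function names are assumptions made for this example and are not taken from the paper. The function `maximally_balanced` assumes the usual reading of that term, namely that every internal node splits its leaves as evenly as possible; `min_colless_bruteforce` is a plain minimization over all root splits and is not the recursive or explicit formula derived in the paper.

```python
from functools import lru_cache


def colless(tree):
    """Return (number of leaves, Colless index) of a rooted binary tree.

    Assumed encoding: a leaf is None, an internal node is a pair (left, right).
    """
    if tree is None:
        return 1, 0
    left, right = tree
    n_left, c_left = colless(left)
    n_right, c_right = colless(right)
    # Each internal node contributes |#leaves(left) - #leaves(right)|.
    return n_left + n_right, c_left + c_right + abs(n_left - n_right)


def maximally_balanced(n):
    """Build a tree with n leaves in which every node splits as evenly as possible."""
    if n == 1:
        return None
    return (maximally_balanced((n + 1) // 2), maximally_balanced(n // 2))


@lru_cache(maxsize=None)
def min_colless_bruteforce(n):
    """Minimal Colless index over all rooted binary trees with n leaves.

    Tries every root split (a, n - a) and uses the additivity of the index.
    """
    if n == 1:
        return 0
    return min(
        min_colless_bruteforce(a) + min_colless_bruteforce(n - a) + abs(n - 2 * a)
        for a in range(1, n // 2 + 1)
    )


if __name__ == "__main__":
    for n in range(1, 17):
        _, c = colless(maximally_balanced(n))
        print(n, c, min_colless_bruteforce(n))
```

For small numbers of leaves, the two printed columns agree, which is consistent with the statement that maximally balanced trees attain the minimal Colless index.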
We introduce some natural families of distributions on rooted binary ranked plane trees with a view toward unifying ideas from various fields, including macroevolution, epidemiology, computational group theory and search algorithms. In the process, we introduce the notions of split-exchangeability and plane-invariance of a general Markov splitting model in order to readily obtain probabilities over various equivalence classes of trees that arise in statistics, phylogenetics, epidemiology and group theory.
Effects like selection in evolution as well as fertility inheritance in the development of populations can lead to a higher degree of asymmetry in evolutionary trees than expected under a null hypothesis. To identify and quantify such influences, various balance indices have been proposed in the phylogenetic literature and have been in use for decades. However, so far no balance index has been based on the number of \emph{symmetry nodes}, even though symmetry nodes play an important role in other areas of mathematical phylogenetics and are a quite natural way to measure the balance or symmetry of a given tree. The aim of this manuscript is thus twofold: First, we introduce the \emph{symmetry nodes index} as an index for measuring the balance of phylogenetic trees and analyze its extremal properties. We also show that this index can be calculated in linear time. This new index turns out to be a generalization of a simple and well-known balance index, namely the \emph{cherry index}, as well as a specialization of another, less established, balance index, namely \emph{Rogers $J$ index}. Thus, the second objective of the present manuscript is to compare the new symmetry nodes index to these two indices and to underline its advantages. In order to do so, we derive some extremal properties of the cherry index and Rogers $J$ index along the way and thus complement existing studies on these indices. Moreover, we used the programming language \textsf{R} to implement all three indices in the software package \textsf{symmeTree}, which has been made publicly available.
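The abstract above does not spell out the definitions, so the following is only a sketch under stated assumptions: a symmetry node is taken to be an internal node of a rooted binary tree whose two child subtrees have the same shape, and a cherry is an internal node whose two children are both leaves. The code counts both kinds of nodes in a single traversal by giving every subtree shape an integer identifier. It does not reproduce the normalization of the symmetry nodes index itself, and it is unrelated to the \textsf{symmeTree} package; the tree encoding and the function name are hypothetical.

```python
def count_symmetry_nodes(tree):
    """Return (number of symmetry nodes, number of cherries).

    Assumed encoding: a leaf is None, an internal node is a pair (left, right).
    Assumed definition: a symmetry node has two child subtrees of equal shape.
    """
    shape_ids = {}  # unordered pair of child shape ids -> id of the parent shape
    counts = {"symmetry": 0, "cherries": 0}

    def visit(node):
        if node is None:
            return 0  # all leaves share shape id 0
        left, right = node
        a, b = visit(left), visit(right)
        if a == b:
            counts["symmetry"] += 1
            if a == 0:
                counts["cherries"] += 1  # both children are leaves
        key = (min(a, b), max(a, b))  # order of the children is irrelevant
        if key not in shape_ids:
            shape_ids[key] = len(shape_ids) + 1
        return shape_ids[key]

    visit(tree)
    return counts["symmetry"], counts["cherries"]


if __name__ == "__main__":
    balanced = ((None, None), (None, None))
    caterpillar = (((None, None), None), None)
    print(count_symmetry_nodes(balanced))
    print(count_symmetry_nodes(caterpillar))
```

Every cherry is a symmetry node under these definitions, which matches the statement that the symmetry nodes index generalizes the cherry index. Each node does a constant number of dictionary operations, so the traversal runs in expected linear time in the number of nodes.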
Among many topological indices of trees, the sum of distances $\sigma(T)$ and the number of subtrees $F(T)$ have been a long-standing pair of graph invariants that are well known for their negative correlation. That is, among various given classes of trees, the extremal structures maximizing one usually minimize the other, and vice versa. By introducing the local
The Perron value $\rho(T)$ of a rooted tree $T$ plays a central role in the study of the algebraic connectivity and characteristic set, and it can be considered a weight of spectral nature for $T$. A different, combinatorial notion of weight for $T$, the moment $\mu(T)$, emerges from the analysis of Kemeny's constant in the context of random walks on graphs. In the present work, we compare these two weight concepts, showing that $\mu(T)$ is almost an upper bound for $\rho(T)$ and that the ratio $\mu(T)/\rho(T)$ is unbounded but at most linear in the order of $T$. To achieve these primary goals, we introduce two new objects associated with $T$, the Perron entropy and the neckbottle matrix, and we investigate how different operations on the set of rooted trees affect the Perron value and the moment.
Let $T_{n}$ be the set of rooted labeled trees on $\{0,\ldots,n\}$. A maximal decreasing subtree of a rooted labeled tree is defined as the maximal subtree containing the root in which all edges are decreasing. In this paper, we study a new refinement $T_{n,k}$ of $T_n$, which is the set of rooted labeled trees whose maximal decreasing subtree has $k+1$ vertices.
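To make the definition concrete, here is a minimal sketch that computes the number of vertices of the maximal decreasing subtree of a rooted labeled tree. It assumes that an edge is decreasing when the label of the child is smaller than the label of its parent; the abstract does not state this convention explicitly. The encoding of the tree as a dictionary from each vertex to its list of children, and the function name, are hypothetical choices for this example.

```python
def maximal_decreasing_subtree_size(children, root):
    """Number of vertices reachable from the root along decreasing edges only.

    children: dict mapping each vertex label to the list of its children.
    Assumption: an edge (parent, child) is decreasing when child < parent.
    """
    size = 0
    stack = [root]
    while stack:
        v = stack.pop()
        size += 1
        for c in children.get(v, []):
            if c < v:  # follow only decreasing edges
                stack.append(c)
    return size


if __name__ == "__main__":
    # Tree on {0, ..., 4} rooted at 3: 3 -> 1, 3 -> 4, 1 -> 0, 1 -> 2.
    tree = {3: [1, 4], 1: [0, 2]}
    print(maximal_decreasing_subtree_size(tree, 3))
```

In this example the decreasing edges starting from the root are 3 -> 1 and 1 -> 0, so the maximal decreasing subtree has the vertices 3, 1 and 0. With $n = 4$ and $k + 1 = 3$, the tree would belong to $T_{4,2}$ under the stated convention.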