In microbial ecology studies, the most commonly used ways of investigating alpha (within-sample) diversity are either to apply count-only measures such as Simpson's index to Operational Taxonomic Unit (OTU) groupings, or to use classical phylogenetic diversity (PD), which is not abundance-weighted. Alpha diversity measures that use abundance information in a phylogenetic framework do exist, but they are not widely used within the microbial ecology community. The performance of abundance-weighted phylogenetic diversity measures compared to classical discrete measures has not been explored, and the behavior of these measures under rarefaction (sub-sampling) is not yet clear. In this paper we compare the ability of various alpha diversity measures to distinguish between different community states in the human microbiome for three different datasets. We also present and compare a novel one-parameter family of alpha diversity measures, $\mathrm{BWPD}_\theta$, that interpolates between classical phylogenetic diversity (PD) and an abundance-weighted extension of PD. Additionally, we examine the sensitivity of these phylogenetic diversity measures to sampling, via computational experiments and by deriving a closed-form solution for the expectation of phylogenetic quadratic entropy under re-sampling. In all three of the datasets considered, an abundance-weighted measure is the best differentiator between community states. OTU-based measures, on the other hand, are less effective in distinguishing community types. In addition, abundance-weighted phylogenetic diversity measures are less sensitive to differing sampling intensity than their unweighted counterparts. Based on these results we encourage the use of abundance-weighted phylogenetic diversity measures, especially for cases such as microbial ecology where species delimitation is difficult.
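To make the contrast between a count-only measure and an abundance-weighted phylogenetic measure concrete, here is a minimal Python sketch. The abstract does not state the formula for the one-parameter family, so the weighting used below (each edge length multiplied by `(2 * min(d, 1 - d)) ** theta`, where `d` is the fraction of reads below the edge) is an assumption chosen only because it reduces to an unweighted sum of edge lengths at `theta = 0`. The toy tree, the read counts, and the function names `gini_simpson` and `weighted_pd` are all hypothetical.

```python
def gini_simpson(counts):
    """Count-only diversity: 1 - sum(p_i^2), the Gini-Simpson form of Simpson's index."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def weighted_pd(edges, abundances, theta):
    """Assumed interpolation between unweighted and abundance-weighted PD.

    edges: list of (length, set of leaves below the edge).
    abundances: dict mapping leaf name to read count.
    theta = 0 counts every edge with reads on both sides at full length;
    theta = 1 weights each edge by how evenly it splits the reads.
    """
    total = sum(abundances.values())
    score = 0.0
    for length, below in edges:
        d = sum(abundances.get(leaf, 0) for leaf in below) / total
        balance = 2.0 * min(d, 1.0 - d)
        if balance > 0:  # edges with all reads on one side contribute nothing
            score += length * balance ** theta
    return score


# Hypothetical toy tree ((A,B),C) given as (edge length, leaves below the edge).
toy_edges = [(1.0, {"A"}), (1.0, {"B"}), (2.0, {"C"}), (0.5, {"A", "B"})]
toy_reads = {"A": 8, "B": 1, "C": 1}

unweighted = weighted_pd(toy_edges, toy_reads, theta=0)
weighted = weighted_pd(toy_edges, toy_reads, theta=1)
simpson = gini_simpson(list(toy_reads.values()))
```

In this toy sample one taxon dominates, so the abundance-weighted score is much smaller than the unweighted one, while the count-only index ignores branch lengths entirely.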
Microorganisms live in environments that inevitably fluctuate between mild and harsh conditions. As harsh conditions may cause extinctions, the rate at which fluctuations occur can shape microbial communities and their diversity, but we still lack an intuition for how. Here, we build a mathematical model describing two microbial species living in an environment where substrate supplies randomly switch between abundant and scarce. We then vary the rate of switching as well as different properties of the interacting species, and measure the probability of the weaker species driving the stronger one extinct. We find that this probability increases with the strength of demographic noise under harsh conditions and peaks at either low, high, or intermediate switching rates depending on both species' ability to withstand the harsh environment. This complex relationship shows why finding patterns between environmental fluctuations and diversity has historically been difficult. In parameter ranges where the fittest species was most likely to be excluded, however, the beta diversity in larger communities also peaked. In sum, how environmental fluctuations affect interactions between a few species pairs predicts their effect on the beta diversity of the whole community.
We derive an invertible transform linking two widely used measures of species diversity: phylogenetic diversity and the expected proportions of segregating (non-constant) sites. We assume a bi-allelic, symmetric, finite-site model of substitution. Like the Hadamard transform of Hendy and Penny, the transform can be expressed completely independently of the underlying phylogeny. Our results bridge work on diversity from two quite distinct scientific communities.
Phylogenetic diversity indices provide a formal way to apportion evolutionary heritage across species. Two natural diversity indices are Fair Proportion (FP) and Equal Splits (ES). FP is also called evolutionary distinctiveness and, for rooted trees, is identical to the Shapley Value (SV), which arises from cooperative game theory. In this paper, we investigate the extent to which FP and ES can differ, characterise tree shapes on which the indices are identical, and study the equivalence of FP and SV and its implications in more detail. We also define and investigate analogues of these indices on unrooted trees (where SV was originally defined), including an index that is closely related to the Pauplin representation of phylogenetic diversity.
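As an illustration of how the two indices apportion edge lengths on a rooted tree, the sketch below uses the standard definitions as I understand them: Fair Proportion divides each edge length equally among all leaves descended from that edge, while Equal Splits divides it equally at every branching point between the edge and the leaf. The tree encoding (a `parent` dictionary), the toy caterpillar tree, and all function names are assumptions made for this example.

```python
def children_of(parent):
    """Invert a child -> (parent, edge length) map into parent -> [children]."""
    kids = {}
    for child, (par, _) in parent.items():
        kids.setdefault(par, []).append(child)
    return kids


def leaves_below(node, kids):
    """Number of leaves descended from node (a leaf counts as 1)."""
    if node not in kids:
        return 1
    return sum(leaves_below(c, kids) for c in kids[node])


def fair_proportion(leaf, parent, kids):
    """Sum over edges on the leaf-to-root path of length / leaves below the edge."""
    score, node = 0.0, leaf
    while node in parent:
        par, length = parent[node]
        score += length / leaves_below(node, kids)
        node = par
    return score


def equal_splits(leaf, parent, kids):
    """Sum over edges on the leaf-to-root path of length / product of
    out-degrees of the internal nodes between the edge and the leaf."""
    score, node, divisor = 0.0, leaf, 1
    while node in parent:
        par, length = parent[node]
        divisor *= len(kids.get(node, [])) or 1  # leaves have no children
        score += length / divisor
        node = par
    return score


# Hypothetical caterpillar tree (((A,B),C),D): child -> (parent, edge length).
parent = {
    "v": ("r", 1.0), "D": ("r", 4.0),
    "u": ("v", 1.0), "C": ("v", 2.0),
    "A": ("u", 1.0), "B": ("u", 1.0),
}
kids = children_of(parent)
fp = {x: fair_proportion(x, parent, kids) for x in "ABCD"}
es = {x: equal_splits(x, parent, kids) for x in "ABCD"}
```

On this tree the two indices disagree for leaves A, B and C but agree for D, and in both cases the scores sum to the total edge length of the tree.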
Phylogenetic Diversity (PD) is a prominent quantitative measure of the biodiversity of a collection of present-day species (taxa). This measure is based on the evolutionary distance among the species in the collection. Loosely speaking, if $\mathcal{T}$ is a rooted phylogenetic tree whose leaf set $X$ represents a set of species and whose edges have real-valued lengths (weights), then the PD score of a subset $S$ of $X$ is the sum of the weights of the edges of the minimal subtree of $\mathcal{T}$ connecting the species in $S$. In this paper, we define several natural variants of the PD score for a subset of taxa which are related by a known rooted phylogenetic network. Under these variants, we explore, for a positive integer $k$, the computational complexity of determining the maximum PD score over all subsets of taxa of size $k$ when the input is restricted to different classes of rooted phylogenetic networks.
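The tree-based PD score described above can be sketched in a few lines of Python. This example follows the loose definition literally: an edge belongs to the minimal subtree connecting $S$ exactly when some, but not all, of the species in $S$ lie below it. Some conventions for rooted trees also include the path up to the root, which this sketch does not do. The edge encoding, the toy tree, and the names `pd_score` and `best_subset` are assumptions for illustration, and the exhaustive search is only meant for tiny inputs; it covers trees only, not the network variants.

```python
from itertools import combinations


def pd_score(edges, subset):
    """Sum the lengths of edges in the minimal subtree connecting `subset`.

    edges: list of (length, set of leaves below the edge).
    An edge is used when it separates at least one chosen leaf from another.
    """
    chosen = set(subset)
    total = 0.0
    for length, below in edges:
        inside = len(chosen & below)
        if 0 < inside < len(chosen):
            total += length
    return total


def best_subset(edges, leaves, k):
    """Brute-force search for a size-k subset with maximum PD (exponential in k)."""
    return max(combinations(sorted(leaves), k), key=lambda s: pd_score(edges, s))


# Hypothetical caterpillar tree (((A,B),C),D) as (edge length, leaves below).
tree_edges = [
    (1.0, {"A"}), (1.0, {"B"}), (2.0, {"C"}), (4.0, {"D"}),
    (1.0, {"A", "B"}), (1.0, {"A", "B", "C"}),
]
all_leaves = {"A", "B", "C", "D"}

best_pair = best_subset(tree_edges, all_leaves, 2)
best_pair_score = pd_score(tree_edges, best_pair)
```

Several pairs tie for the maximum in this toy tree, so `best_subset` simply returns the first one it encounters.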
Tree-based networks are a class of phylogenetic networks that attempt to formally capture what is meant by tree-like evolution. A given non-tree-based phylogenetic network, however, might appear to be very close to being tree-based, or very far from it. In this paper, we formalise the notion of proximity to tree-based for unrooted phylogenetic networks, with a range of proximity measures. These measures also provide characterisations of tree-based networks. One measure in particular, related to the nearest neighbour interchange operation, allows us to define the notion of tree-based rank. This provides a subclassification within the tree-based networks themselves, identifying those networks that are very tree-based. Finally, we prove results relating tree-based networks in the settings of rooted and unrooted phylogenetic networks, showing effectively that an unrooted network is tree-based if and only if it can be made a rooted tree-based network by rooting it and orienting the edges appropriately. This leads to a clarification of the contrasting decision problems for tree-based networks, which are polynomial in the rooted case but NP-complete in the unrooted case.