Consensus methods are widely used for combining phylogenetic trees into a single estimate of the evolutionary tree for a group of species. As more taxa are added, the new source trees may begin to tell a different evolutionary story when restricted to the original set of taxa. However, if the new trees, restricted to the original set of taxa, were to agree exactly with the earlier trees, then we might hope that their consensus would either agree with or resolve the original consensus tree. In this paper, we ask under what conditions consensus methods exist that are future proof in this sense. While we show that some methods (e.g. Adams consensus) have this property for specific types of input, we also establish a rather surprising "no-go" theorem: there is no reasonable consensus method that satisfies the future-proofing property in general. We then investigate a second notion of future proofing for consensus methods, in which trees (rather than taxa) are added, and establish some positive and negative results. We end with some questions for future work.
In a recent study, Bryant, Francis and Steel investigated the concept of \enquote{future-proofing} consensus methods in phylogenetics. That is, they investigated whether such methods can be robust against the introduction of additional data such as extra trees or new species. In the present manuscript, we analyze consensus methods with respect to a different way of introducing new data, namely the discovery of new clades. In evolutionary biology, formerly unresolved clades often get resolved by refined reconstruction methods or new genetic data analyses. We investigate which properties of consensus methods can guarantee that such new insights do not disagree with previously found consensus trees but merely refine them. We call consensus methods with this property \emph{refinement-stable}. Along these lines, we also study two well-known supertree methods, namely Matrix Representation with Parsimony (MRP) and Matrix Representation with Compatibility (MRC), which have also been suggested as consensus methods in the literature. While we (just like Bryant, Francis and Steel in their recent study) unfortunately have to report some negative results concerning general consensus methods, we also state some relevant positive results concerning the majority rule (MR) and strict consensus methods, which are amongst the most frequently used consensus methods. Moreover, we show that there exist infinitely many consensus methods which are refinement-stable and have some other desirable properties.
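For readers unfamiliar with the two consensus methods named above, the following is a minimal sketch of strict and majority rule consensus, together with a check of whether one tree refines another. It assumes each rooted tree is represented as a set of its non-trivial clusters (each a `frozenset` of taxon names); that representation and the function names are illustrative assumptions, not taken from the manuscript.

```python
from collections import Counter

def strict_consensus(trees):
    """Clusters present in every input tree."""
    return set.intersection(*(set(t) for t in trees))

def majority_rule_consensus(trees):
    """Clusters present in strictly more than half of the input trees."""
    counts = Counter(cluster for tree in trees for cluster in set(tree))
    return {c for c, k in counts.items() if 2 * k > len(trees)}

def refines(fine, coarse):
    """A tree refines another if it contains all of the other's clusters."""
    return set(coarse) <= set(fine)

# Hypothetical example on taxa a, b, c, d (trivial clusters omitted).
ab, abc, abd = frozenset("ab"), frozenset("abc"), frozenset("abd")
trees = [{ab, abc}, {ab, abd}, {ab, abc}]

strict = strict_consensus(trees)           # only {a, b} is in all three trees
majority = majority_rule_consensus(trees)  # {a, b} and {a, b, c}
```

In this toy input the majority rule tree refines the strict consensus tree, which is the kind of relationship the notion of refinement stability is concerned with.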
Given a gene tree and a species tree, ancestral configurations represent the combinatorially distinct sets of gene lineages that can reach a given node of the species tree. They have been introduced as a data structure for use in the recursive computation of the conditional probability under the multispecies coalescent model of a gene tree topology given a species tree, the cost of this computation being affected by the number of ancestral configurations of the gene tree in the species tree. For matching gene trees and species trees, we obtain enumerative results on ancestral configurations. We study ancestral configurations in balanced and unbalanced families of trees determined by a given seed tree, showing that for seed trees with more than one taxon, the number of ancestral configurations increases for both families exponentially in the number of taxa $n$. For fixed $n$, the maximal number of ancestral configurations tabulated at the species tree root node and the largest number of labeled histories possible for a labeled topology occur for trees with precisely the same unlabeled shape. For ancestral configurations at the root, the maximum increases with $k_0^n$, where $k_0 \approx 1.5028$ is a quadratic recurrence constant. Under a uniform distribution over the set of labeled trees of given size, the mean number of root ancestral configurations grows with $\sqrt{3/2}(4/3)^n$ and the variance with approximately $1.4048(1.8215)^n$. The results provide a contribution to the combinatorial study of gene trees and species trees.
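As an illustration of how root ancestral configurations can be counted for matching gene and species trees, the sketch below assumes the product recursion $c(t) = (c(t_\ell) + 1)(c(t_r) + 1)$ with $c = 0$ for a single leaf. This recursion is not stated in the abstract and should be read as an assumption, as should the nested-tuple tree encoding and the helper names. It is at least consistent with the constant $k_0$ quoted above, since adding 1 turns it into the quadratic recurrence $x \mapsto x^2 + 1$ on fully balanced trees.

```python
def root_configurations(tree):
    """Count root ancestral configurations under the assumed recursion.

    A leaf is any non-tuple label; an internal node is a pair (left, right).
    """
    if not isinstance(tree, tuple):
        return 0
    left, right = tree
    return (root_configurations(left) + 1) * (root_configurations(right) + 1)

def caterpillar(n):
    """Fully unbalanced tree with n leaves."""
    tree = "t1"
    for i in range(2, n + 1):
        tree = (tree, f"t{i}")
    return tree

def balanced(depth, label="t"):
    """Fully balanced tree with 2**depth leaves."""
    if depth == 0:
        return label
    return (balanced(depth - 1, label + "0"), balanced(depth - 1, label + "1"))

cat5 = root_configurations(caterpillar(5))   # 4: linear growth, n - 1
bal16 = root_configurations(balanced(4))     # 676 for 16 leaves
growth = (bal16 + 1) ** (1 / 16)             # close to k_0 ~ 1.5028
```

Under this assumed recursion the caterpillar shape gives only linear growth, while the balanced shape with 16 leaves already reproduces the exponential rate quoted in the abstract to about four digits.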
The Minimal Ancestral Deviation (MAD) method is a recently introduced procedure for estimating the root of a phylogenetic tree, based only on the shape and branch lengths of the tree. The method is loosely derived from the midpoint rooting method, but, unlike its predecessor, makes use of all pairs of OTUs when positioning the root. In this note we establish properties of this method and then describe a fast and memory-efficient algorithm. As a proof of principle, we use our algorithm to determine the MAD roots for simulated phylogenies with up to 100,000 OTUs. The calculations take a few minutes on a standard laptop.
We present a computational model to reconstruct trees of ancestors for animals with sexual reproduction. Through a recursive algorithm combined with a random number generator, it is possible to reproduce the number of ancestors for each generation and use it to constrain the maximum number of ancestors in the following generation. This new model makes it possible to consider the reproductive preferences of particular species and to combine several trees to simulate the behavior of a population. It is also possible to obtain an analytical description by treating the simulation as a theoretical stochastic process. Such a process can be generalized so that an associated algorithm can be used to simulate other, similar stochastic processes. The simulation is based on a previously presented theoretical model.
The promotion of cooperation on spatial lattices is an important issue in evolutionary game theory. This effect clearly depends on the update rule: it diminishes with stochastic imitative rules whereas it increases with unconditional imitation. To study the transition between the two regimes, we propose a new evolutionary rule, which stochastically combines unconditional imitation with another imitative rule. We find that, surprisingly, in many social dilemmas this rule yields higher cooperative levels than either of the two original ones. This nontrivial effect occurs because the basic rules induce a separation of timescales in the microscopic processes at cluster interfaces. The result is robust in the space of 2x2 symmetric games, on regular lattices and on scale-free networks.
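One way such a stochastic combination of update rules could look is sketched below. The abstract does not specify the second imitative rule or how the two are mixed, so everything here is an assumption made for illustration: the mixing probability `q`, the use of a proportional (replicator-like) imitation rule as the stochastic partner, the normalization `max_gap`, and all function names.

```python
import random

def unconditional_imitation(i, strategies, payoffs, neighbors):
    """Copy the best-scoring player among i and its neighbors (ties keep i)."""
    best = max([i] + list(neighbors[i]), key=lambda j: payoffs[j])
    return strategies[best]

def proportional_imitation(i, strategies, payoffs, neighbors, max_gap, rng):
    """Assumed stochastic rule: copy a random better-off neighbor with
    probability proportional to the payoff difference."""
    j = rng.choice(list(neighbors[i]))
    gap = payoffs[j] - payoffs[i]
    if gap > 0 and rng.random() < gap / max_gap:
        return strategies[j]
    return strategies[i]

def hybrid_update(i, strategies, payoffs, neighbors, q, max_gap, rng):
    """With probability q use unconditional imitation, otherwise the
    stochastic rule (hypothetical mixing scheme)."""
    if rng.random() < q:
        return unconditional_imitation(i, strategies, payoffs, neighbors)
    return proportional_imitation(i, strategies, payoffs, neighbors, max_gap, rng)

# Hypothetical three-player example on a fully connected graph.
rng = random.Random(0)
strategies = ["C", "D", "C"]
payoffs = [1.0, 3.0, 2.0]
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1]}

pure_ui = hybrid_update(0, strategies, payoffs, neighbors, 1.0, 3.0, rng)
top_player = hybrid_update(1, strategies, payoffs, neighbors, 0.0, 3.0, rng)
```

Setting `q = 1` recovers pure unconditional imitation and `q = 0` the purely stochastic rule, so intermediate values interpolate between the two regimes discussed in the abstract.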