
The most parsimonious tree for random data

Published by Prof. Mike Steel
Publication date: 2014
Research field: Biology
Language: English





Applying a method to reconstruct a phylogenetic tree from random data provides a way to detect whether that method has an inherent bias towards certain tree "shapes". For maximum parsimony, applied to a sequence of random 2-state data, each possible binary phylogenetic tree has exactly the same distribution for its parsimony score. Despite this pleasing and slightly surprising symmetry, some binary phylogenetic trees are more likely than others to be a most parsimonious (MP) tree for a sequence of $k$ such characters, as we show. For $k=2$, and unrooted binary trees on six taxa, any tree with a caterpillar shape has a higher chance of being an MP tree than any tree with a symmetric shape. On the other hand, if we take any two binary trees, on any number of taxa, we prove that this bias between the two trees vanishes as the number of characters grows. However, again there is a twist: MP trees on six taxa are more likely to have certain shapes than a uniform distribution on binary phylogenetic trees predicts, and this difference does not appear to dissipate as $k$ grows.
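The symmetry claim for a single random 2-state character can be checked by brute force on six taxa. The sketch below is illustrative only and is not taken from the paper: it assumes Fitch's algorithm as the way to compute the parsimony score (the abstract does not say which procedure the author used), and the names `fitch_score`, `score_distribution`, `CATERPILLAR` and `SYMMETRIC` are hypothetical. Trees are written as nested pairs, rooted on an arbitrary edge, which does not change the parsimony score.

```python
from collections import Counter
from itertools import product

LEAVES = "abcdef"
# Two unrooted binary tree shapes on six taxa, each rooted on an arbitrary edge.
CATERPILLAR = ((((("a", "b"), "c"), "d"), "e"), "f")
SYMMETRIC = (("a", "b"), (("c", "d"), ("e", "f")))


def fitch_score(tree, states):
    """Parsimony score of one character on a rooted binary tree (Fitch's algorithm).

    tree: nested 2-tuples with leaf labels at the tips.
    states: dict mapping each leaf label to its state (0 or 1).
    """
    def visit(node):
        if not isinstance(node, tuple):
            return {states[node]}, 0
        left_set, left_cost = visit(node[0])
        right_set, right_cost = visit(node[1])
        common = left_set & right_set
        if common:
            # The children agree on at least one state: no extra change needed.
            return common, left_cost + right_cost
        # The children disagree: one state change is charged at this node.
        return left_set | right_set, left_cost + right_cost + 1

    return visit(tree)[1]


def score_distribution(tree, leaves=LEAVES):
    """Count how many of the 2^n binary characters give each parsimony score."""
    counts = Counter()
    for values in product((0, 1), repeat=len(leaves)):
        counts[fitch_score(tree, dict(zip(leaves, values)))] += 1
    return counts


if __name__ == "__main__":
    print("caterpillar:", sorted(score_distribution(CATERPILLAR).items()))
    print("symmetric:  ", sorted(score_distribution(SYMMETRIC).items()))
```

Both shapes produce the same counts per score, which matches the single-character symmetry stated in the abstract. The bias the abstract describes only shows up when one asks which tree minimises the total score over $k$ characters, something this sketch does not attempt.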




Read also

Stochastic models of evolution (Markov random fields on trivalent trees) generally assume that different characters (different runs of the stochastic process) are independent and identically distributed. In this paper we take the first steps towards addressing dependent characters. Specifically we show that, under certain technical assumptions regarding the evolution of individual characters, we can detect any significant, history independent, correlation between any pair of multistate characters. For the special case of the Cavender-Farris-Neyman (CFN) model on two states with symmetric transition matrices, our analysis needs milder assumptions. To perform the analysis, we need to prove a new concentration result for multistate random variables of a Markov random field on arbitrary trivalent trees: we show that the random variable counting the number of leaves in any particular subset of states has variance that is subquadratic in the number of leaves.
Trees, one of our most vital resources, are among the first organisms affected by changes in the climate. In this study, tree species interactions and responses to climate in different ecological environments are examined by applying a joint species distribution model to different ecological domains in the United States. Joint species distribution models are useful for learning inter-species relationships and species responses to the environment. The climate's impact on tree species is measured through species abundance in an area. We compare the model's performance across all ecological domains and study the sensitivity of the climate variables. With predicted abundances, future tree species populations can be forecast and the impact of climate change on tree populations measured.
David Bryant, Mike Steel (2008)
The Robinson-Foulds (RF) distance is by far the most widely used measure of dissimilarity between trees. Although the distribution of these distances has been investigated for twenty years, an algorithm that is explicitly polynomial time has yet to be described for computing this distribution (which is also the distribution of trees around a given tree under the popular Robinson-Foulds metric). In this paper we derive a polynomial-time algorithm for this distribution. We show how the distribution can be approximated by a Poisson distribution determined by the proportion of leaves that lie in "cherries" of the given tree. We also describe how our results can be used to derive normalization constants that are required in a recently-proposed maximum likelihood approach to supertree construction.
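For readers unfamiliar with the RF distance itself, here is a minimal sketch of the usual definition: the number of nontrivial splits (leaf bipartitions induced by internal edges) found in exactly one of the two trees. It is not the polynomial-time algorithm for the distribution described in the abstract above. The function names and the nested-tuple tree encoding are assumptions made for illustration.

```python
def nontrivial_splits(tree, leaves):
    """Nontrivial splits of an unrooted tree written as nested tuples.

    The tree is rooted on an arbitrary edge. Each split is stored as the
    side that does NOT contain a fixed reference leaf, so the two halves of
    the same bipartition always map to one canonical frozenset.
    """
    all_leaves = frozenset(leaves)
    reference = min(all_leaves)
    found = set()

    def visit(node):
        if not isinstance(node, tuple):
            return frozenset([node])
        below = frozenset().union(*(visit(child) for child in node))
        side = all_leaves - below if reference in below else below
        # Skip trivial splits (a single leaf against all the others).
        if 1 < len(side) < len(all_leaves) - 1:
            found.add(side)
        return below

    visit(tree)
    return found


def rf_distance(tree_a, tree_b, leaves):
    """Robinson-Foulds distance: size of the symmetric difference of split sets."""
    return len(nontrivial_splits(tree_a, leaves) ^ nontrivial_splits(tree_b, leaves))


if __name__ == "__main__":
    caterpillar = ((((("a", "b"), "c"), "d"), "e"), "f")
    symmetric = (("a", "b"), (("c", "d"), ("e", "f")))
    print(rf_distance(caterpillar, symmetric, "abcdef"))
```

In this example the two six-taxon trees share the splits ab|cdef and abcd|ef and differ in one split each, so the distance is 2.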
Phylogenetic networks are generalizations of phylogenetic trees that allow the representation of reticulation events such as horizontal gene transfer or hybridization, and can also represent uncertainty in inference. A subclass of these, tree-based phylogenetic networks, have been introduced to capture the extent to which reticulate evolution nevertheless broadly follows tree-like patterns. Several important operations that change a general phylogenetic network have been developed in recent years, and are important for allowing algorithms to move around spaces of networks; a vital ingredient in finding an optimal network given some biological data. A key such operation is the Nearest Neighbor Interchange, or NNI. While it is already known that the space of unrooted phylogenetic networks is connected under NNI, it has been unclear whether this also holds for the subspace of tree-based networks. In this paper we show that the space of unrooted tree-based phylogenetic networks is indeed connected under the NNI operation. We do so by explicitly showing how to get from one such network to another one without losing tree-basedness along the way. Moreover, we introduce some new concepts, for instance "shoat networks", and derive some interesting aspects concerning tree-basedness. Last, we use our results to derive an upper bound on the size of the space of tree-based networks.
A. Duarte, R. Fraiman, A. Galves (2016)
It has been repeatedly conjectured that the brain retrieves statistical regularities from stimuli. Here we present a new statistical approach that allows this conjecture to be addressed. This approach is based on a new class of stochastic processes driven by chains with memory of variable length. It leads to a new experimental protocol in which sequences of auditory stimuli generated by a stochastic chain are presented to volunteers while electroencephalographic (EEG) data are recorded from their scalp. A new statistical model selection procedure for functional data is introduced and proved to be consistent. Applied to samples of EEG data collected using our experimental protocol, it produces results supporting the conjecture that the brain effectively identifies the structure of the chain generating the sequence of stimuli.