The maximum agreement subtree problem


Abstract in English

In this paper we investigate an extremal problem on binary phylogenetic trees. Given two such trees $T_1$ and $T_2$, both with leaf-set ${1,2,...,n}$, we are interested in the size of the largest subset $S subseteq {1,2,...,n}$ of leaves in a common subtree of $T_1$ and $T_2$. We show that any two binary phylogenetic trees have a common subtree on $Omega(sqrt{log{n}})$ leaves, thus improving on the previously known bound of $Omega(loglog n)$ due to M. Steel and L. Szekely. To achieve this improved bound, we first consider two special cases of the problem: when one of the trees is balanced or a caterpillar, we show that the largest common subtree has $Omega(log n)$ leaves. We then handle the general case by proving and applying a Ramsey-type result: that every binary tree contains either a large balanced subtree or a large caterpillar. We also show that there are constants $c, alpha > 0$ such that, when both trees are balanced, they have a common subtree on $c n^alpha$ leaves. We conjecture that it is possible to take $alpha = 1/2$ in the unrooted case, and both $c = 1$ and $alpha = 1/2$ in the rooted case.

Download