
On the Asymptotic Distributions of Classes of Subtree Additive Properties of Plane Trees under the Nearest Neighbor Thermodynamic Model

Added by Anna Kirkpatrick
Publication date: 2021
Language: English





We define a class of properties on random plane trees, which we call subtree additive properties, inspired by the combinatorics of certain biologically-interesting properties in a plane tree model of RNA secondary structure. The class of subtree additive properties includes the Wiener index and path length (total ladder distance and total ladder contact distance, respectively, in the biological context). We then investigate the asymptotic distribution of these subtree additive properties on a random plane tree distributed according to a Gibbs distribution arising from the Nearest Neighbor Thermodynamic Model for RNA secondary structure. We show that for any property in the class considered, there is a constant that translates the uniformly weighted random variable to the Gibbs distribution weighted random variable (and we provide the constant). We also relate the asymptotic distribution of another class of properties, which we call simple subtree additive properties, to the asymptotic distribution of the path length, both in the uniformly weighted case. The primary proof techniques in this paper come from analytic combinatorics, and most of our results follow from relating the moments of known and unknown distributions and showing that this is sufficient for convergence.
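To make the two named properties concrete, here is a minimal sketch (our own illustration, not code from the paper) that computes the Wiener index (total pairwise distance) and the path length (total depth) of a small plane tree given as nested children lists; all function names and the tree representation are illustrative choices.

```python
from collections import deque

# A plane tree given as nested lists: each node is the list of its children (in order).
EXAMPLE = [[[], []], []]   # root with two children; the first child has two leaves

def to_adjacency(tree):
    """Flatten a nested-list plane tree into an adjacency list (node 0 = root)."""
    adj = [[]]
    def walk(node, idx):
        for child in node:
            cid = len(adj)
            adj.append([idx])
            adj[idx].append(cid)
            walk(child, cid)
    walk(tree, 0)
    return adj

def distances_from(adj, src):
    """BFS distances from src to every vertex."""
    dist = [-1] * len(adj)
    dist[src] = 0
    q = deque([src])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if dist[v] < 0:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def wiener_index(adj):
    """Sum of distances over unordered pairs of vertices."""
    return sum(sum(distances_from(adj, s)[s + 1:]) for s in range(len(adj)))

def path_length(adj):
    """Sum of root-to-vertex distances (depths)."""
    return sum(distances_from(adj, 0))
```

For `EXAMPLE` (5 vertices), `wiener_index` returns 18 and `path_length` returns 6, matching a hand count of pairwise distances and depths.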




Related research

A subtree of a tree is any induced subgraph that is again a tree (i.e., connected). The mean subtree order of a tree is the average number of vertices of its subtrees. This invariant was first analyzed in the 1980s by Jamison. An intriguing open question raised by Jamison asks whether the maximum of the mean subtree order, given the order of the tree, is always attained by some caterpillar. While we do not completely resolve this conjecture, we find some evidence in its favor by proving different features of trees that attain the maximum. For example, we show that the diameter of a tree of order $n$ with maximum mean subtree order must be very close to $n$. Moreover, we show that the maximum mean subtree order is equal to $n - 2\log_2 n + O(1)$. For the local mean subtree order, which is the average order of all subtrees containing a fixed vertex, we can be even more precise: we show that its maximum is always attained by a broom and that it is equal to $n - \log_2 n + O(1)$.
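As a concrete check of the definition above, the following brute-force sketch (our own illustration, not from the paper; names are hypothetical) enumerates all connected induced subgraphs of a small tree and averages their orders:

```python
from itertools import combinations

def is_connected(vertices, edges):
    """Check that the induced subgraph on `vertices` is connected."""
    vs = set(vertices)
    if not vs:
        return False
    seen = {next(iter(vs))}
    frontier = list(seen)
    while frontier:
        u = frontier.pop()
        for a, b in edges:
            for x, y in ((a, b), (b, a)):
                if x == u and y in vs and y not in seen:
                    seen.add(y)
                    frontier.append(y)
    return seen == vs

def mean_subtree_order(n, edges):
    """Average order over all subtrees (connected induced subgraphs) of a tree."""
    orders = [len(s)
              for k in range(1, n + 1)
              for s in combinations(range(n), k)
              if is_connected(s, edges)]
    return sum(orders) / len(orders)
```

For the path on three vertices the subtrees have orders 1, 1, 1, 2, 2, 3, so the mean subtree order is 10/6. The enumeration is exponential in $n$, so this only serves to verify the definition on small examples.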
Among many topological indices of trees, the sum of distances $\sigma(T)$ and the number of subtrees $F(T)$ have been a long-standing pair of graph invariants that are well known for their negative correlation. That is, among various given classes of trees, the extremal structures maximizing one usually minimize the other, and vice versa. By introducing the local
Andrew Ying, Wen-Xin Zhou (2019)
We investigate the asymptotic behavior of several variants of the scan statistic applied to empirical distributions, which can be applied to detect the presence of an anomalous interval of any length. Of particular interest is the Studentized scan statistic, which is preferable in practice. The main ingredients in the proof are Kolmogorov's theorem, a Poisson approximation, and recent technical results by Kabluchko et al. (2014).
Junfeng He (2012)
Fast approximate nearest neighbor (NN) search in large databases is becoming popular. Several powerful learning-based formulations have been proposed recently. However, not much attention has been paid to a more fundamental question: how difficult is (approximate) nearest neighbor search in a given data set? And which data properties affect the difficulty of nearest neighbor search, and how? This paper introduces the first concrete measure, called Relative Contrast, that can be used to evaluate the influence of several crucial data characteristics such as dimensionality, sparsity, and database size simultaneously in arbitrary normed metric spaces. Moreover, we present a theoretical analysis to prove how the difficulty measure (relative contrast) determines the complexity of Locality Sensitive Hashing, a popular approximate NN search method. Relative contrast also provides an explanation for a family of heuristic hashing algorithms with good practical performance based on PCA. Finally, we show that most of the previous works in measuring NN search meaningfulness/difficulty can be derived as special asymptotic cases, for dense vectors, of the proposed measure.
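The measure is easy to estimate empirically. The sketch below (our own illustration; the paper defines relative contrast via expectations over the data and query distributions, while this is a finite-sample estimate with hypothetical names) computes the ratio of the mean distance to the nearest-neighbor distance, averaged over queries:

```python
def relative_contrast(points, queries, p=2):
    """Empirical relative contrast in an L_p space: for each query, the ratio of
    its mean distance to the data over its nearest-neighbor distance, averaged
    across queries. Larger values suggest NN search is easier."""
    def dist(a, b):
        return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)
    ratios = []
    for q in queries:
        ds = [dist(q, x) for x in points]
        d_min = min(ds)
        d_mean = sum(ds) / len(ds)
        if d_min > 0:          # skip queries that coincide with a data point
            ratios.append(d_mean / d_min)
    return sum(ratios) / len(ratios)
```

For instance, with data `[(0,0), (1,0), (3,0)]` and the single query `(0.1, 0)`, the distances are 0.1, 0.9, and 2.9, giving a relative contrast of 1.3 / 0.1 = 13.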
Suppose $V$ is an $n$-element set where, for each $x \in V$, the elements of $V \setminus \{x\}$ are ranked by their similarity to $x$. The $K$-nearest neighbor graph is a directed graph including an arc from each $x$ to the $K$ points of $V \setminus \{x\}$ most similar to $x$. Constructive approximation to this graph using far fewer than $n^2$ comparisons is important for the analysis of large high-dimensional data sets. $K$-Nearest Neighbor Descent is a parameter-free heuristic where a sequence of graph approximations is constructed, in which second neighbors in one approximation are proposed as neighbors in the next. Run times in a test case fit an $O(n K^2 \log n)$ pattern. This bound is rigorously justified for a similar algorithm, using range queries, when applied to a homogeneous Poisson process in suitable dimension. However, the basic algorithm fails to achieve subquadratic complexity on sets whose similarity rankings arise from a ``generic'' linear order on the $\binom{n}{2}$ inter-point distances in a metric space.
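The "second neighbors become candidates" idea can be sketched in a few lines. This is a simplified, forward-only illustration under our own naming (the published NN-Descent algorithm also exploits reverse neighbors and candidate sampling, which this sketch omits):

```python
import random

def nn_descent(points, K, dist, iters=10):
    """Simplified K-NN Descent: start from random neighbor lists, then
    repeatedly propose neighbors-of-neighbors as replacement candidates."""
    n = len(points)
    # Current approximation: for each i, a sorted list of (distance, j) pairs.
    graph = []
    for i in range(n):
        sample = random.sample([j for j in range(n) if j != i], K)
        graph.append(sorted((dist(points[i], points[j]), j) for j in sample))
    for _ in range(iters):
        updated = False
        for i in range(n):
            # Second neighbors of i in the current approximation.
            candidates = {j for _, u in graph[i] for _, j in graph[u] if j != i}
            for j in candidates:
                d = dist(points[i], points[j])
                # Replace the current worst neighbor if j is closer and new.
                if d < graph[i][-1][0] and all(j != v for _, v in graph[i]):
                    graph[i][-1] = (d, j)
                    graph[i].sort()
                    updated = True
        if not updated:        # converged: no neighbor list improved
            break
    return [[j for _, j in nbrs] for nbrs in graph]
```

Each pass costs roughly $O(nK^2)$ distance evaluations, which is where the $O(nK^2 \log n)$ empirical pattern quoted above comes from when the number of passes grows logarithmically.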