Efficient Learning of Optimal Markov Network Topology with k-Tree Modeling

88 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Liang Ding

تاريخ النشر 2018

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Liang Ding - Di Chang - Russell Malmberg

بنى وهياكل البيانات والخوارزميات الذكاء الاصطناعي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

The seminal work of Chow and Liu (1968) shows that approximation of a finite probabilistic system by Markov trees can achieve the minimum information loss with the topology of a maximum spanning tree. Our current paper generalizes the result to Markov networks of tree width $leq k$, for every fixed $kgeq 2$. In particular, we prove that approximation of a finite probabilistic system with such Markov networks has the minimum information loss when the network topology is achieved with a maximum spanning $k$-tree. While constructing a maximum spanning $k$-tree is intractable for even $k=2$, we show that polynomial algorithms can be ensured by a sufficient condition accommodated by many meaningful applications. In particular, we prove an efficient algorithm for learning the optimal topology of higher order correlations among random variables that belong to an underlying linear structure.

قيم البحث

83 - Aleksander Figiel , Leon Kellerhals , Rolf Niedermeier 2021

The performance of distributed and data-centric applications often critically depends on the interconnecting network. Applications are hence modeled as virtual networks, also accounting for resource demands on links. At the heart of provisioning such virtual networks lies the NP-hard Virtual Network Embedding Problem (VNEP): how to jointly map the virtual nodes and links onto a physical substrate network at minimum cost while obeying capacities. This paper studies the VNEP in the light of parameterized complexity. We focus on tree topology substrates, a case often encountered in practice and for which the VNEP remains NP-hard. We provide the first fixed-parameter algorithm for the VNEP with running time $O(3^r (s+r^2))$ for requests and substrates of $r$ and $s$ nodes, respectively. In a computational study our algorithm yields running time improvements in excess of 200x compared to state-of-the-art integer programming approaches. This makes it comparable in speed to the well-established ViNE heuristic while providing optimal solutions. We complement our algorithmic study with hardness results for the VNEP and related problems.

بنى وهياكل البيانات والخوارزميات بنية الشبكات والإنترنت

Optimal Tree Topology for a Submarine Cable Network With Constrained Internodal Latency

324 - Tianjiao Wang , Zengfu Wang , Bill Moran 2020

This paper provides an optimized cable path planning solution for a tree-topology network in an irregular 2D manifold in a 3D Euclidean space, with an application to the planning of submarine cable networks. Our solution method is based on total cost minimization, where the individual cable costs are assumed to be linear to the length of the corresponding submarine cables subject to latency constraints between pairs of nodes. These latency constraints limit the cable length and number of hops between any pair of nodes. Our method combines the Fast Marching Method (FMM) and a new Integer Linear Programming (ILP) formulation for Minimum Spanning Tree (MST) where there are constraints between pairs of nodes. We note that this problem of MST with constraints is NP-complete. Nevertheless, we demonstrate that ILP running time is adequate for the great majority of existing cable systems. For cable systems for which ILP is not able to find the optimal solution within an acceptable time, we propose an alternative heuristic algorithm based on Prims algorithm. In addition, we apply our FMM/ILP-based algorithm to a real-world cable path planning example and demonstrate that it can effectively find an MST with latency constraints between pairs of nodes.

أنظمة وتحكم أنظمة وتحكم معالجة الإشارات

Dual-tree $k$-means with bounded iteration runtime

125 - Ryan R. Curtin 2016

k-means is a widely used clustering algorithm, but for $k$ clusters and a dataset size of $N$, each iteration of Lloyds algorithm costs $O(kN)$ time. Although there are existing techniques to accelerate single Lloyd iterations, none of these are tail ored to the case of large $k$, which is increasingly common as dataset sizes grow. We propose a dual-tree algorithm that gives the exact same results as standard $k$-means; when using cover trees, we use adaptive analysis techniques to, under some assumptions, bound the single-iteration runtime of the algorithm as $O(N + k log k)$. To our knowledge these are the first sub-$O(kN)$ bounds for exact Lloyd iterations. We then show that this theoretically favorable algorithm performs competitively in practice, especially for large $N$ and $k$ in low dimensions. Further, the algorithm is tree-independent, so any type of tree may be used.

بنى وهياكل البيانات والخوارزميات التعلم الآلي

Efficient tree-structured categorical retrieval

196 - Djamal Belazzougui , Gregory Kucherov 2020

We study a document retrieval problem in the new framework where $D$ text documents are organized in a {em category tree} with a pre-defined number $h$ of categories. This situation occurs e.g. with taxomonic trees in biology or subject classificatio n systems for scientific literature. Given a string pattern $p$ and a category (level in the category tree), we wish to efficiently retrieve the $t$ emph{categorical units} containing this pattern and belonging to the category. We propose several efficient solutions for this problem. One of them uses $n(logsigma(1+o(1))+log D+O(h)) + O(Delta)$ bits of space and $O(|p|+t)$ query time, where $n$ is the total length of the documents, $sigma$ the size of the alphabet used in the documents and $Delta$ is the total number of nodes in the category tree. Another solution uses $n(logsigma(1+o(1))+O(log D))+O(Delta)+O(Dlog n)$ bits of space and $O(|p|+tlog D)$ query time. We finally propose other solutions which are more space-efficient at the expense of a slight increase in query time.

بنى وهياكل البيانات والخوارزميات

An efficient sampling algorithm for difficult tree pairs

91 - Sean Cleary , Roland Maio 2020

It is an open question whether there exists a polynomial-time algorithm for computing the rotation distances between pairs of extended ordered binary trees. The problem of computing the rotation distance between an arbitrary pair of trees, (S, T), ca n be efficiently reduced to the problem of computing the rotation distance between a difficult pair of trees (S, T), where there is no known first step which is guaranteed to be the beginning of a minimal length path. Of interest, therefore, is how to sample such difficult pairs of trees of a fixed size. We show that it is possible to do so efficiently, and present such an algorithm that runs in time $O(n^4)$.

بنى وهياكل البيانات والخوارزميات