Lagrangian Relaxation for MAP Estimation in Graphical Models

Published by Jason Johnson

Publication date: 2007

Research field: Informatics Engineering

Paper language: English

Authors: Jason K. Johnson, Dmitry M. Malioutov, Alan S. Willsky

Artificial Intelligence

We develop a general framework for MAP estimation in discrete and Gaussian graphical models using Lagrangian relaxation techniques. The key idea is to reformulate an intractable estimation problem as one defined on a more tractable graph, but subject to additional constraints. Relaxing these constraints gives a tractable dual problem, one defined by a thin graph, which is then optimized by an iterative procedure. When this iterative optimization leads to a consistent estimate, one which also satisfies the constraints, then it corresponds to an optimal MAP estimate of the original model. Otherwise there is a "duality gap", and we obtain a bound on the optimal solution. Thus, our approach combines convex optimization with dynamic programming techniques applicable for thin graphs. The popular tree-reweighted max-product (TRMP) method may be seen as solving a particular class of such relaxations, where the intractable graph is relaxed to a set of spanning trees. We also consider relaxations to a set of small induced subgraphs, thin subgraphs (e.g., loops), and a connected tree obtained by "unwinding" cycles. In addition, we propose a new class of multiscale relaxations that introduce "summary" variables. The potential benefits of such generalizations include: reducing or eliminating the "duality gap" in hard problems, reducing the number of Lagrange multipliers in the dual problem, and accelerating convergence of the iterative optimization procedure.
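To make the relax-then-optimize idea concrete, here is a minimal Python sketch of a generic Lagrangian (dual decomposition) scheme on a small binary pairwise model. It is not the authors' algorithm: the decomposition into single edges, the projected subgradient update, the step-size schedule, and the names `dual_decomposition_map` and `model_value` are all assumptions chosen for brevity. Each edge keeps its own copy of its endpoint variables, the agreement constraints are relaxed with multipliers, and the sum of the edge optima is an upper bound on the MAP value.

```python
import itertools
import numpy as np

def model_value(x, theta_node, theta_edge):
    """Objective: sum of node potentials plus sum of edge potentials."""
    value = sum(theta_node[i, x[i]] for i in range(len(x)))
    return value + sum(t[x[i], x[j]] for (i, j), t in theta_edge.items())

def dual_decomposition_map(theta_node, theta_edge, iters=500, step0=1.0):
    """Subgradient dual decomposition over single-edge subproblems.

    Assumes binary variables and that every node touches at least one edge.
    Returns (assignment or None, upper bound on the MAP value, consistent flag).
    """
    n = theta_node.shape[0]
    edges = list(theta_edge)
    degree = np.zeros(n)
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1
    # lam[e][k] holds the multipliers for the k-th endpoint of edge e.
    # They start at zero and the update keeps their sum over edges at zero.
    lam = {e: np.zeros((2, 2)) for e in edges}
    best_bound = np.inf
    for t in range(iters):
        copies, bound = {}, 0.0
        for e in edges:
            i, j = e
            # Edge subproblem: edge potential, a 1/degree share of each
            # endpoint's node potential, and the current multipliers.
            score = (theta_edge[e]
                     + (theta_node[i] / degree[i] + lam[e][0])[:, None]
                     + (theta_node[j] / degree[j] + lam[e][1])[None, :])
            xi, xj = np.unravel_index(np.argmax(score), score.shape)
            copies[e] = (int(xi), int(xj))
            bound += score[xi, xj]
        best_bound = min(best_bound, bound)
        # Count how often each node value was chosen across its copies.
        counts = np.zeros((n, 2))
        for e, choice in copies.items():
            for node, value in zip(e, choice):
                counts[node, value] += 1
        if np.all(counts.max(axis=1) == degree):
            # All copies agree: no duality gap, the estimate is a MAP estimate.
            return counts.argmax(axis=1), best_bound, True
        avg = counts / degree[:, None]
        step = step0 / (1 + t)
        for e in edges:
            for k, node in enumerate(e):
                indicator = np.zeros(2)
                indicator[copies[e][k]] = 1.0
                lam[e][k] -= step * (indicator - avg[node])
    return None, best_bound, False

# Hypothetical test model: a 4-cycle with attractive edges.
rng = np.random.default_rng(0)
n = 4
theta_node = rng.normal(size=(n, 2))
theta_edge = {(i, (i + 1) % n): np.eye(2) for i in range(n)}
x_hat, bound, consistent = dual_decomposition_map(theta_node, theta_edge)
map_value = max(model_value(x, theta_node, theta_edge)
                for x in itertools.product([0, 1], repeat=n))
print("consistent:", consistent, "bound:", bound, "brute-force MAP:", map_value)
```

The brute-force comparison at the end is only feasible for toy models; it is included to show that the dual value never falls below the true MAP value and that a consistent estimate attains it. Replacing the single-edge subproblems with spanning trees or other thin subgraphs changes only the inner maximization, which would then be done by dynamic programming.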

Andrew J Sedgewick, Joseph D. Ramsey, Peter Spirtes, 2017

Graphical causal models are an important tool for knowledge discovery because they can represent both the causal relations between variables and the multivariate probability distributions over the data. Once learned, causal graphs can be used for classification, feature selection and hypothesis generation, while revealing the underlying causal network structure and thus allowing for arbitrary likelihood queries over the data. However, current algorithms for learning sparse directed graphs are generally designed to handle only one type of data (continuous-only or discrete-only), which limits their applicability to a large class of multi-modal biological datasets that include mixed type variables. To address this issue, we developed new methods that modify and combine existing methods for finding undirected graphs with methods for finding directed graphs. These hybrid methods are not only faster, but also perform better than the directed graph estimation methods alone for a variety of parameter settings and data set sizes. Here, we describe a new conditional independence test for learning directed graphs over mixed data types and we compare performances of different graph learning strategies on synthetic data.

Artificial Intelligence, Machine Learning

Parameter Estimation for Undirected Graphical Models with Hard Constraints

Bhaswar B. Bhattacharya, Kavita Ramanan, 2020

The hardcore model on a graph $G$ with parameter $\lambda>0$ is a probability measure on the collection of all independent sets of $G$, that assigns to each independent set $I$ a probability proportional to $\lambda^{|I|}$. In this paper we consider the problem of estimating the parameter $\lambda$ given a single sample from the hardcore model on a graph $G$. To bypass the computational intractability of the maximum likelihood method, we use the maximum pseudo-likelihood (MPL) estimator, which for the hardcore model has a surprisingly simple closed form expression. We show that for any sequence of graphs $\{G_N\}_{N \geq 1}$, where $G_N$ is a graph on $N$ vertices, the MPL estimate of $\lambda$ is $\sqrt{N}$-consistent, whenever the graph sequence has uniformly bounded average degree. We then derive sufficient conditions under which the MPL estimate of the activity parameters is $\sqrt{N}$-consistent given a single sample from a general $H$-coloring model, in which restrictions between adjacent colors are encoded by a constraint graph $H$. We verify the sufficient conditions for models where there is at least one unconstrained color as long as the graph sequence has uniformly bounded average degree. This applies to many $H$-coloring examples such as the Widom-Rowlinson and multi-state hard-core models. On the other hand, for the $q$-coloring model, which falls outside this class, we show that consistent estimation may be impossible even for graphs with bounded average degree. Nevertheless, we show that the MPL estimate is $\sqrt{N}$-consistent in the $q$-coloring model when $\{G_N\}_{N \geq 1}$ has bounded average double neighborhood. The presence of hard constraints, as opposed to soft constraints, leads to new challenges, and our proofs entail applications of the method of exchangeable pairs as well as combinatorial arguments that employ the probabilistic method.

Probability, Combinatorics, Statistics Theory

A Concise Function Representation for Faster Exact MPE and Constrained Optimisation in Graphical Models

Filippo Bistaffa, 2021

We propose a novel concise function representation for graphical models, a central theoretical framework that provides the basis for many reasoning tasks. We then show how we exploit our concise representation based on deterministic finite state automata within Bucket Elimination (BE), a general approach based on the concept of variable elimination that accommodates many inference and optimisation tasks such as most probable explanation and constrained optimisation. We denote our version of BE as FABE. By using our concise representation within FABE, we dramatically improve the performance of BE in terms of runtime and memory requirements. Results on standard benchmarks obtained using an established experimental methodology show that FABE often outperforms the best available approach (RBFAOO), leading to significant runtime improvements (up to 2 orders of magnitude in our tests).

Artificial Intelligence

RockIt: Exploiting Parallelism and Symmetry for MAP Inference in Statistical Relational Models

Jan Noessner, Mathias Niepert, Heiner Stuckenschmidt, 2013

RockIt is a maximum a-posteriori (MAP) query engine for statistical relational models. MAP inference in graphical models is an optimization problem which can be compiled to integer linear programs (ILPs). We describe several advances in translating MAP queries to ILP instances and present the novel meta-algorithm cutting plane aggregation (CPA). CPA exploits local context-specific symmetries and bundles up sets of linear constraints. The resulting counting constraints lead to more compact ILPs and make the symmetry of the ground model more explicit to state-of-the-art ILP solvers. Moreover, RockIt parallelizes most parts of the MAP inference pipeline taking advantage of ubiquitous shared-memory multi-core architectures. We report on extensive experiments with Markov logic network (MLN) benchmarks showing that RockIt outperforms the state-of-the-art systems Alchemy, Markov TheBeast, and Tuffy both in terms of efficiency and quality of results.

Artificial Intelligence

Asymptotic normality and optimalities in estimation of large Gaussian graphical models

Zhao Ren, Tingni Sun, Cun-Hui Zhang, 2013

The Gaussian graphical model, a popular paradigm for studying relationships among variables in a wide range of applications, has attracted great attention in recent years. This paper considers a fundamental question: When is it possible to estimate low-dimensional parameters at parametric square-root rate in a large Gaussian graphical model? A novel regression approach is proposed to obtain asymptotically efficient estimation of each entry of a precision matrix under a sparseness condition relative to the sample size. When the precision matrix is not sufficiently sparse, or equivalently the sample size is not sufficiently large, a lower bound is established to show that it is no longer possible to achieve the parametric rate in the estimation of each entry. This lower bound result, which provides an answer to the delicate sample size question, is established with a novel construction of a subset of sparse precision matrices in an application of Le Cam's lemma. Moreover, the proposed estimator is proven to have optimal convergence rate when the parametric rate cannot be achieved, under a minimal sample requirement. The proposed estimator is applied to test the presence of an edge in the Gaussian graphical model or to recover the support of the entire model, to obtain adaptive rate-optimal estimation of the entire precision matrix as measured by the matrix $\ell_q$ operator norm and to make inference in latent variables in the graphical model. All of this is achieved under a sparsity condition on the precision matrix and a side condition on the range of its spectrum. This significantly relaxes the commonly imposed uniform signal strength condition on the precision matrix, irrepresentability condition on the Hessian tensor operator of the covariance matrix or the $\ell_1$ constraint on the precision matrix. Numerical results confirm our theoretical findings. The ROC curve of the proposed algorithm, Asymptotic Normal Thresholding (ANT), for support recovery significantly outperforms that of the popular GLasso algorithm.

Statistics Theory, Methodology, Machine Learning