
Graph cuts always find a global optimum for Potts models (with a catch)

Posted by Hunter Lang
Publication date: 2020
Research language: English





We prove that the $\alpha$-expansion algorithm for MAP inference always returns a globally optimal assignment for Markov Random Fields with Potts pairwise potentials, with a catch: the returned assignment is only guaranteed to be optimal for an instance within a small perturbation of the original problem instance. In other words, all local minima with respect to expansion moves are global minima of slightly perturbed versions of the problem.
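To make the setting concrete, here is a minimal sketch of the Potts energy and the expansion-move loop on a tiny instance. All names (`potts_energy`, `best_expansion`, `unary`, `lam`) are illustrative, and the optimal expansion move is brute-forced for clarity; the actual algorithm solves that subproblem exactly with a single graph cut.

```python
import itertools
import numpy as np

def potts_energy(labels, unary, edges, lam):
    """Potts MRF energy: sum of unary costs plus lam for every disagreeing edge."""
    e = sum(unary[i, labels[i]] for i in range(len(labels)))
    return e + lam * sum(labels[i] != labels[j] for i, j in edges)

def best_expansion(labels, alpha, unary, edges, lam):
    """Optimal alpha-expansion move, found here by brute force over all node subsets.

    Exponential and only for illustration; in the real algorithm this subproblem
    is solved exactly by one min-cut / max-flow computation.
    """
    best, best_e = list(labels), potts_energy(labels, unary, edges, lam)
    for mask in itertools.product([False, True], repeat=len(labels)):
        cand = [alpha if m else l for m, l in zip(mask, labels)]
        e = potts_energy(cand, unary, edges, lam)
        if e < best_e:
            best, best_e = cand, e
    return best, best_e

def alpha_expansion(unary, edges, lam):
    """Iterate expansion moves until no label's expansion improves the energy."""
    n_labels = unary.shape[1]
    labels = list(np.argmin(unary, axis=1))   # start from the unary-optimal labeling
    improved = True
    while improved:
        improved = False
        for alpha in range(n_labels):
            new, new_e = best_expansion(labels, alpha, unary, edges, lam)
            if new_e < potts_energy(labels, unary, edges, lam) - 1e-12:
                labels, improved = new, True
    return labels

# Tiny example: 4 nodes on a path, 3 labels, random unary costs.
rng = np.random.default_rng(0)
unary = rng.random((4, 3))
edges = [(0, 1), (1, 2), (2, 3)]
print(alpha_expansion(unary, edges, lam=0.3))
```

A labeling that no expansion move can improve is a local minimum in this sense; the result above says any such labeling is exactly optimal for a slightly perturbed copy of the instance.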


Read also

Reward-Weighted Regression (RWR) belongs to a family of widely known iterative Reinforcement Learning algorithms based on the Expectation-Maximization framework. In this family, learning at each iteration consists of sampling a batch of trajectories using the current policy and fitting a new policy to maximize a return-weighted log-likelihood of actions. Although RWR is known to yield monotonic improvement of the policy under certain circumstances, whether and under which conditions RWR converges to the optimal policy have remained open questions. In this paper, we provide for the first time a proof that RWR converges to a global optimum when no function approximation is used.
As an increasing number of genome-wide association studies reveal the limitations of attempting to explain phenotypic heritability by single genetic loci, there is growing interest for associating complex phenotypes with sets of genetic loci. While several methods for multi-locus mapping have been proposed, it is often unclear how to relate the detected loci to the growing knowledge about gene pathways and networks. The few methods that take biological pathways or networks into account are either restricted to investigating a limited number of predetermined sets of loci, or do not scale to genome-wide settings. We present SConES, a new efficient method to discover sets of genetic loci that are maximally associated with a phenotype, while being connected in an underlying network. Our approach is based on a minimum cut reformulation of the problem of selecting features under sparsity and connectivity constraints that can be solved exactly and rapidly. SConES outperforms state-of-the-art competitors in terms of runtime, scales to hundreds of thousands of genetic loci, and exhibits higher power in detecting causal SNPs in simulation studies than existing methods. On flowering time phenotypes and genotypes from Arabidopsis thaliana, SConES detects loci that enable accurate phenotype prediction and that are supported by the literature. Matlab code for SConES is available at http://webdav.tuebingen.mpg.de/u/karsten/Forschung/scones/
We prove that the global minimum of the backpropagation (BP) training problem of neural networks with an arbitrary nonlinear activation is given by the ridgelet transform. A series of computational experiments show that there exists an interesting similarity between the scatter plot of hidden parameters in a shallow neural network after the BP training and the spectrum of the ridgelet transform. By introducing a continuous model of neural networks, we reduce the training problem to a convex optimization in an infinite dimensional Hilbert space, and obtain the explicit expression of the global optimizer via the ridgelet transform.
This work is superseded by the paper arXiv:2011.14066. Stochastic gradient descent is the de facto algorithm for training deep neural networks (DNNs). Despite its popularity, it still requires fine-tuning in order to achieve its best performance. This has led to the development of adaptive methods that claim automatic hyper-parameter optimization. Recently, researchers have studied both algorithmic classes via toy examples: e.g., for over-parameterized linear regression, Wilson et al. (2017) show that, while SGD always converges to the minimum-norm solution, adaptive methods show no such inclination, leading to worse generalization capabilities. Our aim is to study this conjecture further. We empirically show that the minimum weight norm is not necessarily the proper gauge of good generalization in simplified scenarios, and different models found by adaptive methods could outperform plain gradient methods. In practical DNN settings, we observe that adaptive methods can outperform SGD, with larger weight norm output models, but without necessarily reducing the amount of tuning required.
Circuit representations are becoming the lingua franca to express and reason about tractable generative and discriminative models. In this paper, we show how complex inference scenarios for these models that commonly arise in machine learning -- from computing the expectations of decision tree ensembles to information-theoretic divergences of deep mixture models -- can be represented in terms of tractable modular operations over circuits. Specifically, we characterize the tractability of a vocabulary of simple transformations -- sums, products, quotients, powers, logarithms, and exponentials -- in terms of sufficient structural constraints of the circuits they operate on, and present novel hardness results for the cases in which these properties are not satisfied. Building on these operations, we derive a unified framework for reasoning about tractable models that generalizes several results in the literature and opens up novel tractable inference scenarios.
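The first related abstract above describes the Reward-Weighted Regression update: sample trajectories with the current policy, then refit the policy to maximize a return-weighted log-likelihood of the actions. As a rough illustration only (a toy tabular construction, not the paper's setting), the weighted maximum-likelihood step below makes the new policy proportional to the return accumulated by each state-action pair, assuming non-negative returns.

```python
import numpy as np

def rwr_update(states, actions, returns, n_states, n_actions):
    # One tabular Reward-Weighted Regression step: maximizing the return-weighted
    # log-likelihood of actions makes pi(a|s) proportional to the total return
    # collected by action a in state s (returns assumed non-negative).
    weights = np.full((n_states, n_actions), 1e-8)  # small floor avoids zero rows
    for s, a, r in zip(states, actions, returns):
        weights[s, a] += r
    return weights / weights.sum(axis=1, keepdims=True)

# Toy batch: two states, two actions, per-sample returns.
states, actions, returns = [0, 0, 1, 1], [0, 1, 0, 1], [1.0, 3.0, 2.0, 2.0]
print(rwr_update(states, actions, returns, n_states=2, n_actions=2))
```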


