Gumbel-softmax-based Optimization: A Simple General Framework for Optimization Problems on Graphs

86 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Jing Liu

تاريخ النشر 2020

مجال البحث الهندسة المعلوماتية الاحصاء الرياضي

والبحث باللغة English

تأليف Yaoxin Li - Jing Liu - Guozheng Lin

التعلم الآلي التعلم الالي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

In computer science, there exist a large number of optimization problems defined on graphs, that is to find a best node state configuration or a network structure such that the designed objective function is optimized under some constraints. However, these problems are notorious for their hardness to solve because most of them are NP-hard or NP-complete. Although traditional general methods such as simulated annealing (SA), genetic algorithms (GA) and so forth have been devised to these hard problems, their accuracy and time consumption are not satisfying in practice. In this work, we proposed a simple, fast, and general algorithm framework based on advanced automatic differentiation technique empowered by deep learning frameworks. By introducing Gumbel-softmax technique, we can optimize the objective function directly by gradient descent algorithm regardless of the discrete nature of variables. We also introduce evolution strategy to parallel version of our algorithm. We test our algorithm on three representative optimization problems on graph including modularity optimization from network science, Sherrington-Kirkpatrick (SK) model from statistical physics, maximum independent set (MIS) and minimum vertex cover (MVC) problem from combinatorial optimization on graph. High-quality solutions can be obtained with much less time consuming compared to traditional approaches.

قيم البحث

86 - Jialin Song , Yuxin Chen , Yisong Yue 2018

How can we efficiently gather information to optimize an unknown function, when presented with multiple, mutually dependent information sources with different costs? For example, when optimizing a robotic system, intelligently trading off computer si mulations and real robot testings can lead to significant savings. Existing methods, such as multi-fidelity GP-UCB or Entropy Search-based approaches, either make simplistic assumptions on the interaction among different fidelities or use simple heuristics that lack theoretical guarantees. In this paper, we study multi-fidelity Bayesian optimization with complex structural dependencies among multiple outputs, and propose MF-MI-Greedy, a principled algorithmic framework for addressing this problem. In particular, we model different fidelities using additive Gaussian processes based on shared latent structures with the target function. Then we use cost-sensitive mutual information gain for efficient Bayesian global optimization. We propose a simple notion of regret which incorporates the cost of different fidelities, and prove that MF-MI-Greedy achieves low regret. We demonstrate the strong empirical performance of our algorithm on both synthetic and real-world datasets.

التعلم الآلي التعلم الالي

Generalized Gumbel-Softmax Gradient Estimator for Various Discrete Random Variables

135 - Weonyoung Joo , Dongjun Kim , Seungjae Shin 2020

Estimating the gradients of stochastic nodes is one of the crucial research questions in the deep generative modeling community, which enables the gradient descent optimization on neural network parameters. This estimation problem becomes further com plex when we regard the stochastic nodes to be discrete because pathwise derivative techniques cannot be applied. Hence, the stochastic gradient estimation of discrete distributions requires either a score function method or continuous relaxation of the discrete random variables. This paper proposes a general version of the Gumbel-Softmax estimator with continuous relaxation, and this estimator is able to relax the discreteness of probability distributions including more diverse types, other than categorical and Bernoulli. In detail, we utilize the truncation of discrete random variables and the Gumbel-Softmax trick with a linear transformation for the relaxed reparameterization. The proposed approach enables the relaxed discrete random variable to be reparameterized and to backpropagated through a large scale stochastic computational graph. Our experiments consist of (1) synthetic data analyses, which show the efficacy of our methods; and (2) applications on VAE and topic model, which demonstrate the value of the proposed estimation in practices.

التعلم الآلي التعلم الالي

A Memristor-Based Optimization Framework for AI Applications

69 - Sijia Liu , Yanzhi Wang , Makan Fardad 2017

Memristors have recently received significant attention as ubiquitous device-level components for building a novel generation of computing systems. These devices have many promising features, such as non-volatility, low power consumption, high densit y, and excellent scalability. The ability to control and modify biasing voltages at the two terminals of memristors make them promising candidates to perform matrix-vector multiplications and solve systems of linear equations. In this article, we discuss how networks of memristors arranged in crossbar arrays can be used for efficiently solving optimization and machine learning problems. We introduce a new memristor-based optimization framework that combines the computational merit of memristor crossbars with the advantages of an operator splitting method, alternating direction method of multipliers (ADMM). Here, ADMM helps in splitting a complex optimization problem into subproblems that involve the solution of systems of linear equations. The capability of this framework is shown by applying it to linear programming, quadratic programming, and sparse optimization. In addition to ADMM, implementation of a customized power iteration (PI) method for eigenvalue/eigenvector computation using memristor crossbars is discussed. The memristor-based PI method can further be applied to principal component analysis (PCA). The use of memristor crossbars yields a significant speed-up in computation, and thus, we believe, has the potential to advance optimization and machine learning research in artificial intelligence (AI).

التقنيات الناشئة التعلم الالي

A Bi-Level Framework for Learning to Solve Combinatorial Optimization on Graphs

92 - Runzhong Wang , Zhigang Hua , Gan Liu 2021

Combinatorial Optimization (CO) has been a long-standing challenging research topic featured by its NP-hard nature. Traditionally such problems are approximately solved with heuristic algorithms which are usually fast but may sacrifice the solution q uality. Currently, machine learning for combinatorial optimization (MLCO) has become a trending research topic, but most existing MLCO methods treat CO as a single-level optimization by directly learning the end-to-end solutions, which are hard to scale up and mostly limited by the capacity of ML models given the high complexity of CO. In this paper, we propose a hybrid approach to combine the best of the two worlds, in which a bi-level framework is developed with an upper-level learning method to optimize the graph (e.g. add, delete or modify edges in a graph), fused with a lower-level heuristic algorithm solving on the optimized graph. Such a bi-level approach simplifies the learning on the original hard CO and can effectively mitigate the demand for model capacity. The experiments and results on several popular CO problems like Directed Acyclic Graph scheduling, Graph Edit Distance and Hamiltonian Cycle Problem show its effectiveness over manually designed heuristics and single-level learning methods.

التعلم الآلي التوافقية

A Dynamic Programming Framework for Combinatorial Optimization Problems on Graphs with Bounded Pathwidth

139 - Mugurel Ionut Andreica 2012

In this paper we present an algorithmic framework for solving a class of combinatorial optimization problems on graphs with bounded pathwidth. The problems are NP-hard in general, but solvable in linear time on this type of graphs. The problems are r elevant for assessing network reliability and improving the networks performance and fault tolerance. The main technique considered in this paper is dynamic programming.

بنى وهياكل البيانات والخوارزميات الرياضيات المتقطعة