Solving Mixed Integer Programs Using Neural Networks

136 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Nicolas Sonnerat

تاريخ النشر 2020

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Vinod Nair - Sergey Bartunov - Felix Gimeno

التحسين والتحكم الذكاء الاصطناعي الرياضيات المتقطعة

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Mixed Integer Programming (MIP) solvers rely on an array of sophisticated heuristics developed with decades of research to solve large-scale MIP instances encountered in practice. Machine learning offers to automatically construct better heuristics from data by exploiting shared structure among instances in the data. This paper applies learning to the two key sub-tasks of a MIP solver, generating a high-quality joint variable assignment, and bounding the gap in objective value between that assignment and an optimal one. Our approach constructs two corresponding neural network-based components, Neural Diving and Neural Branching, to use in a base MIP solver such as SCIP. Neural Diving learns a deep neural network to generate multiple partial assignments for its integer variables, and the resulting smaller MIPs for un-assigned variables are solved with SCIP to construct high quality joint assignments. Neural Branching learns a deep neural network to make variable selection decisions in branch-and-bound to bound the objective value gap with a small tree. This is done by imitating a new variant of Full Strong Branching we propose that scales to large instances using GPUs. We evaluate our approach on six diverse real-world datasets, including two Google production datasets and MIPLIB, by training separate neural networks on each. Most instances in all the datasets combined have $10^3-10^6$ variables and constraints after presolve, which is significantly larger than previous learning approaches. Comparing solvers with respect to primal-dual gap averaged over a held-out set of instances, the learning-augmented SCIP is 2x to 10x better on all datasets except one on which it is $10^5$x better, at large time limits. To the best of our knowledge, ours is the first learning approach to demonstrate such large improvements over SCIP on both large-scale real-world application datasets and MIPLIB.

قيم البحث

167 - Nicolas Sonnerat , Pengming Wang , Ira Ktena 2021

Large Neighborhood Search (LNS) is a combinatorial optimization heuristic that starts with an assignment of values for the variables to be optimized, and iteratively improves it by searching a large neighborhood around the current assignment. In this paper we consider a learning-based LNS approach for mixed integer programs (MIPs). We train a Neural Diving model to represent a probability distribution over assignments, which, together with an off-the-shelf MIP solver, generates an initial assignment. Formulating the subsequent search steps as a Markov Decision Process, we train a Neural Neighborhood Selection policy to select a search neighborhood at each step, which is searched using a MIP solver to find the next assignment. The policy network is trained using imitation learning. We propose a target policy for imitation that, given enough compute resources, is guaranteed to select the neighborhood containing the optimal next assignment amongst all possible choices for the neighborhood of a specified size. Our approach matches or outperforms all the baselines on five real-world MIP datasets with large-scale instances from diverse applications, including two production applications at Google. It achieves $2times$ to $37.8times$ better average primal gap than the best baseline on three of the datasets at large running times.

التحسين والتحكم التعلم الآلي

A General Large Neighborhood Search Framework for Solving Integer Linear Programs

170 - Jialin Song , Ravi Lanka , Yisong Yue 2020

This paper studies a strategy for data-driven algorithm design for large-scale combinatorial optimization problems that can leverage existing state-of-the-art solvers in general purpose ways. The goal is to arrive at new approaches that can reliably outperform existing solvers in wall-clock time. We focus on solving integer programs, and ground our approach in the large neighborhood search (LNS) paradigm, which iteratively chooses a subset of variables to optimize while leaving the remainder fixed. The appeal of LNS is that it can easily use any existing solver as a subroutine, and thus can inherit the benefits of carefully engineered heuristic or complete approaches and their software implementations. We show that one can learn a good neighborhood selector using imitation and reinforcement learning techniques. Through an extensive empirical validation in bounded-time optimization, we demonstrate that our LNS framework can significantly outperform compared to state-of-the-art commercial solvers such as Gurobi.

التحسين والتحكم التعلم الآلي التعلم الالي

Integer Minkowski Programs and the Design of Survivable Networks

148 - Elke Eisenschmidt Otto-von-Guericke-Universitat Magdeburg 2006

We introduce a new class of optimization problems called integer Minkowski programs. The formulation of such problems involves finitely many integer variables and nonlinear constraints involving functionals defined on families of discrete or polyhedr al sets. We show that, under certain assumptions, it is possible to reformulate them as integer linear programs, by making use of integral generating sets. We then apply this technique to the network design problem for fractional and integral flows subject to survivability constraints.

التحسين والتحكم

Mixed Precision Training of Convolutional Neural Networks using Integer Operations

174 - Dipankar Das , Naveen Mellempudi , Dheevatsa Mudigere 2018

The state-of-the-art (SOTA) for mixed precision training is dominated by variants of low precision floating point operations, and in particular, FP16 accumulating into FP32 Micikevicius et al. (2017). On the other hand, while a lot of research has al so happened in the domain of low and mixed-precision Integer training, these works either present results for non-SOTA networks (for instance only AlexNet for ImageNet-1K), or relatively small datasets (like CIFAR-10). In this work, we train state-of-the-art visual understanding neural networks on the ImageNet-1K dataset, with Integer operations on General Purpose (GP) hardware. In particular, we focus on Integer Fused-Multiply-and-Accumulate (FMA) operations which take two pairs of INT16 operands and accumulate results into an INT32 output.We propose a shared exponent representation of tensors and develop a Dynamic Fixed Point (DFP) scheme suitable for common neural network operations. The nuances of developing an efficient integer convolution kernel is examined, including methods to handle overflow of the INT32 accumulator. We implement CNN training for ResNet-50, GoogLeNet-v1, VGG-16 and AlexNet; and these networks achieve or exceed SOTA accuracy within the same number of iterations as their FP32 counterparts without any change in hyper-parameters and with a 1.8X improvement in end-to-end training throughput. To the best of our knowledge these results represent the first INT16 training results on GP hardware for ImageNet-1K dataset using SOTA CNNs and achieve highest reported accuracy using half-precision

الحوسبة العصبية والتطورية التعلم الآلي التحليل العددي

Designing Networks: A Mixed-Integer Linear Optimization Approach

365 - Chrysanthos E. Gounaris , Karthikeyan Rajendran , Ioannis G. Kevrekidis 2015

Designing networks with specified collective properties is useful in a variety of application areas, enabling the study of how given properties affect the behavior of network models, the downscaling of empirical networks to workable sizes, and the an alysis of network evolution. Despite the importance of the task, there currently exists a gap in our ability to systematically generate networks that adhere to theoretical guarantees for the given property specifications. In this paper, we propose the use of Mixed-Integer Linear Optimization modeling and solution methodologies to address this Network Generation Problem. We present a number of useful modeling techniques and apply them to mathematically express and constrain network properties in the context of an optimization formulation. We then develop complete formulations for the generation of networks that attain specified levels of connectivity, spread, assortativity and robustness, and we illustrate these via a number of computational case studies.

التحسين والتحكم الشبكات الاجتماعية والمعلومات الفيزياء والمجتمع