Low Budget Active Learning via Wasserstein Distance: An Integer Programming Approach


Published by Rafid Mahmood

Publication date: 2021

Research field: Informatics Engineering

Paper language: English

Authors: Rafid Mahmood - Sanja Fidler - Marc T. Law

Machine learning, Optimization and control


Given restrictions on the availability of data, active learning is the process of training a model with limited labeled data by selecting a core subset of an unlabeled data pool to label. Although selecting the most useful points for training is an optimization problem, the scale of deep learning data sets forces most selection strategies to employ efficient heuristics. Instead, we propose a new integer optimization problem for selecting a core set that minimizes the discrete Wasserstein distance from the unlabeled pool. We demonstrate that this problem can be tractably solved with a Generalized Benders Decomposition algorithm. Our strategy requires high-quality latent features which we obtain by unsupervised learning on the unlabeled pool. Numerical results on several data sets show that our optimization approach is competitive with baselines and particularly outperforms them in the low budget regime where less than one percent of the data set is labeled.
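
The selection objective described in the abstract can be illustrated on a toy problem. The sketch below is not the authors' method: it assumes one-dimensional features, uniform weights on every point, and exhaustive search over subsets in place of the integer program and Generalized Benders Decomposition. The function names `wasserstein_1d` and `best_core_set` are hypothetical.

```python
from itertools import combinations


def wasserstein_1d(xs, ys):
    """Wasserstein-1 distance between the uniform empirical distributions on xs and ys (1-D only)."""
    points = sorted(set(xs) | set(ys))
    total = 0.0
    for left, right in zip(points, points[1:]):
        # Both empirical CDFs are constant on [left, right), so evaluate them at `left`.
        cdf_x = sum(x <= left for x in xs) / len(xs)
        cdf_y = sum(y <= left for y in ys) / len(ys)
        total += abs(cdf_x - cdf_y) * (right - left)
    return total


def best_core_set(pool, budget):
    """Brute-force search (toy scale only) for the `budget` pool points closest to the pool in W1."""
    return min(
        combinations(pool, budget),
        key=lambda subset: wasserstein_1d(subset, pool),
    )


# Assumed toy pool: two tight clusters of "latent features".
pool = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
core = best_core_set(pool, budget=2)
print(core, wasserstein_1d(core, pool))
```

With a budget of two, the search picks the middle point of each cluster, which is the intuition behind choosing a core set that represents the whole pool. Exhaustive search scales combinatorially, which is why the paper turns to an integer programming formulation for realistic pool sizes.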


Jianshen Zhu, Naveed Ahmed Azam, Kazuya Haraguchi (2021)

Recently, a novel framework has been proposed for designing the molecular structure of chemical compounds using both artificial neural networks (ANNs) and mixed integer linear programming (MILP). In the framework, we first define a feature vector $f(C)$ of a chemical graph $C$ and construct an ANN that maps $x=f(C)$ to a predicted value $\eta(x)$ of a chemical property $\pi$ of $C$. After this, we formulate an MILP that simulates the computation process of $f(C)$ from $C$ and that of $\eta(x)$ from $x$. Given a target value $y^*$ of the chemical property $\pi$, we infer a chemical graph $C^\dagger$ such that $\eta(f(C^\dagger))=y^*$ by solving the MILP. In this paper, we use linear regression to construct a prediction function $\eta$ instead of ANNs. For this, we derive an MILP formulation that simulates the computation process of a prediction function by linear regression. The results of computational experiments suggest our method can infer chemical graphs with up to around 50 non-hydrogen atoms.

Machine learning, Optimization and control, Biomolecules

Projection Robust Wasserstein Distance and Riemannian Optimization

Tianyi Lin, Chenyou Fan, Nhat Ho (2020)

Projection robust Wasserstein (PRW) distance, or Wasserstein projection pursuit (WPP), is a robust variant of the Wasserstein distance. Recent work suggests that this quantity is more robust than the standard Wasserstein distance, in particular when comparing probability measures in high dimensions. However, it is ruled out for practical application because the optimization model is essentially non-convex and non-smooth, which makes the computation intractable. Our contribution in this paper is to revisit the original motivation behind WPP/PRW, but take the hard route of showing that, despite its non-convexity and non-smoothness, and even despite some hardness results proved by Niles-Weed and Rigollet (2019) in a minimax sense, the original formulation for PRW/WPP can be efficiently computed in practice using Riemannian optimization, yielding in relevant cases better behavior than its convex relaxation. More specifically, we provide three simple algorithms with solid theoretical guarantees on their complexity bounds (one in the appendix), and demonstrate their effectiveness and efficiency by conducting extensive experiments on synthetic and real data. This paper provides a first step into a computational theory of the PRW distance and provides the links between optimal transport and Riemannian optimization.
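
For orientation, the PRW distance between two measures $\mu$ and $\nu$ on $\mathbb{R}^d$ is commonly written as below. The notation is an assumption made here and is not given in the abstract: $k$ is the projection dimension, $\mathrm{St}(d,k)$ is the Stiefel manifold, and $U^\top_{\#}\mu$ is the push-forward of $\mu$ under $x \mapsto U^\top x$.

```latex
\mathcal{P}_k(\mu,\nu) \;=\; \max_{U \in \mathrm{St}(d,k)} \; W_2\bigl(U^\top_{\#}\mu,\; U^\top_{\#}\nu\bigr),
\qquad
\mathrm{St}(d,k) \;=\; \bigl\{\, U \in \mathbb{R}^{d \times k} : U^\top U = I_k \,\bigr\}.
```

The outer maximization over the Stiefel manifold is the source of the non-convexity mentioned above, and it is the part that Riemannian optimization addresses.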

Machine learning, Optimization and control

Optimal qubit assignment and routing via integer programming

Giacomo Nannicini, Lev S. Bishop, Oktay Gunluk (2021)

We consider the problem of mapping a logical quantum circuit onto a given hardware with limited two-qubit connectivity. We model this problem as an integer linear program, using a network flow formulation with binary variables that includes the initial allocation of qubits and their routing. We consider several cost functions: an approximation of the fidelity of the circuit, its total depth, and a measure of cross-talk, all of which can be incorporated in the model. Numerical experiments on synthetic data and different hardware topologies indicate that the error rate and depth can be optimized simultaneously without significant loss. We test our algorithm on a large number of quantum volume circuits, optimizing for error rate and depth; our algorithm significantly reduces the number of CNOTs compared to Qiskit's default transpiler, SABRE, and produces circuits that, when executed on hardware, exhibit higher fidelity.

Quantum physics, Optimization and control

AWCD: An Efficient Point Cloud Processing Approach via Wasserstein Curvature

Yihao Luo, Ailing Yang, Fupeng Sun (2021)

In this paper, we introduce adaptive Wasserstein curvature denoising (AWCD), an original processing approach for point cloud data. By collecting curvature information from the Wasserstein distance, AWCD considers more precise structures of the data and preserves stability and effectiveness even for data with high-density noise. This paper contains some theoretical analysis of the Wasserstein curvature and the complete AWCD algorithm. In addition, we design digital experiments to show the denoising effect of AWCD. Based on the comparison results, we present the advantages of AWCD over traditional algorithms.

Machine learning, Computer vision and pattern recognition

Learning Pseudo-Backdoors for Mixed Integer Programs

Aaron Ferber, Jialin Song, Bistra Dilkina (2021)

We propose a machine learning approach for quickly solving Mixed Integer Programs (MIP) by learning to prioritize a set of decision variables, which we call pseudo-backdoors, for branching that results in faster solution times. Learning-based approaches have seen success in the area of solving combinatorial optimization problems by being able to flexibly leverage common structures in a given distribution of problems. Our approach takes inspiration from the concept of strong backdoors, which corresponds to a small set of variables such that only branching on these variables yields an optimal integral solution and a proof of optimality. Our notion of pseudo-backdoors corresponds to a small set of variables such that only branching on them leads to faster solve times (which can be solver dependent). A key advantage of pseudo-backdoors over strong backdoors is that they are much more amenable to data-driven identification or prediction. Our proposed method learns to estimate the solver performance of a proposed pseudo-backdoor, using a labeled dataset collected on a set of training MIP instances. This model can then be used to identify high-quality pseudo-backdoors on new MIP instances from the same distribution. We evaluate our method on the generalized independent set problem and find that our approach can efficiently identify high-quality pseudo-backdoors. In addition, we compare our learned approach against Gurobi, a state-of-the-art MIP solver, demonstrating that our method can be used to improve solver performance.

Machine learning, Optimization and control