Conditional gradient methods have recently attracted much attention in both the machine learning and optimization communities. These simple methods can guarantee the generation of sparse solutions. In addition, without the computation of full gradients, they can handle huge-scale problems, sometimes even with an exponentially increasing number of decision variables. This paper aims to significantly expand the application areas of these methods by presenting new conditional gradient methods for solving convex optimization problems with general affine and nonlinear constraints. More specifically, we first present a new constraint extrapolated conditional gradient (CoexCG) method that can achieve an $\mathcal{O}(1/\epsilon^2)$ iteration complexity for both smooth and structured nonsmooth function constrained convex optimization. We further develop novel variants of CoexCG, namely constraint extrapolated and dual regularized conditional gradient (CoexDurCG) methods, that can achieve an iteration complexity similar to that of CoexCG but allow adaptive selection of algorithmic parameters. We illustrate the effectiveness of these methods for solving an important class of radiation therapy treatment planning problems arising from the healthcare industry. To the best of our knowledge, all the algorithmic schemes and their complexity results are new in the area of projection-free methods.
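As context for the projection-free schemes discussed above, the sketch below illustrates the classical conditional gradient (Frank-Wolfe) iteration that CoexCG and CoexDurCG build on: each step calls a linear minimization oracle over the feasible set in place of a projection, which is what keeps iterates sparse. This is a minimal sketch over the probability simplex, not an implementation of the constraint-extrapolated methods proposed in the paper; the function name and the toy problem instance are chosen purely for illustration.

```python
import numpy as np

def frank_wolfe_simplex(grad, x0, num_iters=100):
    """Classical conditional gradient (Frank-Wolfe) over the probability simplex.

    grad: callable returning the gradient of the smooth objective at x.
    Over the simplex, the linear minimization oracle returns a vertex,
    so each iterate is a convex combination of few vertices (sparsity)
    and no projection is ever computed.
    """
    x = x0.copy()
    for t in range(num_iters):
        g = grad(x)
        # Linear minimization oracle: argmin over the simplex of <g, s>
        # is attained at the vertex indexed by the smallest gradient entry.
        s = np.zeros_like(x)
        s[np.argmin(g)] = 1.0
        gamma = 2.0 / (t + 2)  # standard open-loop step size
        x = (1 - gamma) * x + gamma * s
    return x

# Toy usage: minimize ||x - b||^2 over the simplex (b lies in the simplex).
b = np.array([0.1, 0.7, 0.2])
x_star = frank_wolfe_simplex(lambda x: 2 * (x - b), np.ones(3) / 3)
```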