LEAD: Least-Action Dynamics for Min-Max Optimization

116 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Reyhane Askari Hemmat

تاريخ النشر 2020

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Reyhane Askari Hemmat - Amartya Mitra - Guillaume Lajoie

التعلم الآلي علوم الكمبيوتر ونظرية الألعاب أنظمة متعددة العملاء

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Adversarial formulations such as generative adversarial networks (GANs) have rekindled interest in two-player min-max games. A central obstacle in the optimization of such games is the rotational dynamics that hinder their convergence. Existing methods typically employ intuitive, carefully hand-designed mechanisms for controlling such rotations. In this paper, we take a novel approach to address this issue by casting min-max optimization as a physical system. We leverage tools from physics to introduce LEAD (Least-Action Dynamics), a second-order optimizer for min-max games. Next, using Lyapunov stability theory and spectral analysis, we study LEADs convergence properties in continuous and discrete-time settings for bilinear games to demonstrate linear convergence to the Nash equilibrium. Finally, we empirically evaluate our method on synthetic setups and CIFAR-10 image generation to demonstrate improvements over baseline methods.

قيم البحث

132 - Lampros Flokas , Emmanouil-Vasileios Vlatakis-Gkaragkounis , Georgiosn Piliouras 2021

Many recent AI architectures are inspired by zero-sum games, however, the behavior of their dynamics is still not well understood. Inspired by this, we study standard gradient descent ascent (GDA) dynamics in a specific class of non-convex non-concav e zero-sum games, that we call hidden zero-sum games. In this class, players control the inputs of smooth but possibly non-linear functions whose outputs are being applied as inputs to a convex-concave game. Unlike general zero-sum games, these games have a well-defined notion of solution; outcomes that implement the von-Neumann equilibrium of the hidden convex-concave game. We prove that if the hidden game is strictly convex-concave then vanilla GDA converges not merely to local Nash, but typically to the von-Neumann solution. If the game lacks strict convexity properties, GDA may fail to converge to any equilibrium, however, by applying standard regularization techniques we can prove convergence to a von-Neumann solution of a slightly perturbed zero-sum game. Our convergence guarantees are non-local, which as far as we know is a first-of-its-kind type of result in non-convex non-concave games. Finally, we discuss connections of our framework with generative adversarial networks.

التحسين والتحكم علوم الكمبيوتر ونظرية الألعاب التعلم الآلي

Fast Distributionally Robust Learning with Variance Reduced Min-Max Optimization

124 - Yaodong Yu , Tianyi Lin , Eric Mazumdar 2021

Distributionally robust supervised learning (DRSL) is emerging as a key paradigm for building reliable machine learning systems for real-world applications -- reflecting the need for classifiers and predictive models that are robust to the distributi on shifts that arise from phenomena such as selection bias or nonstationarity. Existing algorithms for solving Wasserstein DRSL -- one of the most popular DRSL frameworks based around robustness to perturbations in the Wasserstein distance -- involve solving complex subproblems or fail to make use of stochastic gradients, limiting their use in large-scale machine learning problems. We revisit Wasserstein DRSL through the lens of min-max optimization and derive scalable and efficiently implementable stochastic extra-gradient algorithms which provably achieve faster convergence rates than existing approaches. We demonstrate their effectiveness on synthetic and real data when compared to existing DRSL approaches. Key to our results is the use of variance reduction and random reshuffling to accelerate stochastic min-max optimization, the analysis of which may be of independent interest.

التعلم الآلي التحسين والتحكم التعلم الالي

The Min-Max Complexity of Distributed Stochastic Convex Optimization with Intermittent Communication

249 - Blake Woodworth , Brian Bullins , Ohad Shamir 2021

We resolve the min-max complexity of distributed stochastic convex optimization (up to a log factor) in the intermittent communication setting, where $M$ machines work in parallel over the course of $R$ rounds of communication to optimize the objecti ve, and during each round of communication, each machine may sequentially compute $K$ stochastic gradient estimates. We present a novel lower bound with a matching upper bound that establishes an optimal algorithm.

التعلم الآلي التحسين والتحكم

Complexity Lower Bounds for Nonconvex-Strongly-Concave Min-Max Optimization

74 - Haochuan Li , Yi Tian , Jingzhao Zhang 2021

We provide a first-order oracle complexity lower bound for finding stationary points of min-max optimization problems where the objective function is smooth, nonconvex in the minimization variable, and strongly concave in the maximization variable. W e establish a lower bound of $Omegaleft(sqrt{kappa}epsilon^{-2}right)$ for deterministic oracles, where $epsilon$ defines the level of approximate stationarity and $kappa$ is the condition number. Our analysis shows that the upper bound achieved in (Lin et al., 2020b) is optimal in the $epsilon$ and $kappa$ dependence up to logarithmic factors. For stochastic oracles, we provide a lower bound of $Omegaleft(sqrt{kappa}epsilon^{-2} + kappa^{1/3}epsilon^{-4}right)$. It suggests that there is a significant gap between the upper bound $mathcal{O}(kappa^3 epsilon^{-4})$ in (Lin et al., 2020a) and our lower bound in the condition number dependence.

التحسين والتحكم التعلم الآلي التعلم الالي

Solving Non-Convex Non-Differentiable Min-Max Games using Proximal Gradient Method

626 - Babak Barazandeh , Meisam Razaviyayn 2020

Min-max saddle point games appear in a wide range of applications in machine leaning and signal processing. Despite their wide applicability, theoretical studies are mostly limited to the special convex-concave structure. While some recent works gene ralized these results to special smooth non-convex cases, our understanding of non-smooth scenarios is still limited. In this work, we study special form of non-smooth min-max games when the objective function is (strongly) convex with respect to one of the players decision variable. We show that a simple multi-step proximal gradient descent-ascent algorithm converges to $epsilon$-first-order Nash equilibrium of the min-max game with the number of gradient evaluations being polynomial in $1/epsilon$. We will also show that our notion of stationarity is stronger than existing ones in the literature. Finally, we evaluate the performance of the proposed algorithm through adversarial attack on a LASSO estimator.

التحسين والتحكم علوم الكمبيوتر ونظرية الألعاب التعلم الآلي