Zeroth-Order Stochastic Variance Reduction for Nonconvex Optimization


Published by: Sijia Liu

Publication date: 2018

Research field: Informatics Engineering, Mathematical Statistics

Paper language: English

Authors: Sijia Liu, Bhavya Kailkhura, Pin-Yu Chen

Machine Learning


As application demands for zeroth-order (gradient-free) optimization accelerate, the need for variance reduced and faster converging approaches is also intensifying. This paper addresses these challenges by presenting: a) a comprehensive theoretical analysis of variance reduced zeroth-order (ZO) optimization, b) a novel variance reduced ZO algorithm, called ZO-SVRG, and c) an experimental evaluation of our approach in the context of two compelling applications, black-box chemical material classification and generation of adversarial examples from black-box deep neural network models. Our theoretical analysis uncovers an essential difficulty in the analysis of ZO-SVRG: the unbiased assumption on gradient estimates no longer holds. We prove that compared to its first-order counterpart, ZO-SVRG with a two-point random gradient estimator could suffer an additional error of order $O(1/b)$, where $b$ is the mini-batch size. To mitigate this error, we propose two accelerate
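The two-point random gradient estimator mentioned in the abstract can be illustrated with a minimal sketch. This assumes one common form of the estimator (a forward difference along directions drawn uniformly from the unit sphere, scaled by the dimension); the function name `two_point_zo_gradient`, the smoothing parameter `mu`, and the direction count `num_directions` are illustrative choices, not taken from the paper. For a general objective, such an estimate is biased with respect to the true gradient, which is the difficulty the abstract points to.

```python
import numpy as np

def two_point_zo_gradient(f, x, mu=1e-3, num_directions=1, rng=None):
    """Estimate the gradient of f at x from function values only.

    Assumed form (illustrative): average over random unit directions u of
    (d / mu) * (f(x + mu * u) - f(x)) * u, where d is the dimension.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    d = x.size
    fx = f(x)  # the base value is reused for every direction
    grad = np.zeros(d)
    for _ in range(num_directions):
        # Normalizing a Gaussian vector gives a uniform direction on the sphere.
        u = rng.standard_normal(d)
        u /= np.linalg.norm(u)
        grad += (d / mu) * (f(x + mu * u) - fx) * u
    return grad / num_directions
```

Averaging over more directions lowers the variance of the estimate at the cost of more function queries, which is the trade-off that variance-reduced ZO methods aim to improve.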


Xinlei Yi, Shengjun Zhang, Tao Yang, 2021

In this paper, we consider a stochastic distributed nonconvex optimization problem with the cost function being distributed over $n$ agents having access only to zeroth-order (ZO) information of the cost. This problem has various machine learning applications. As a solution, we propose two distributed ZO algorithms, in which at each iteration each agent samples the local stochastic ZO oracle at two points with an adaptive smoothing parameter. We show that the proposed algorithms achieve the linear speedup convergence rate $\mathcal{O}(\sqrt{p/(nT)})$ for smooth cost functions and the $\mathcal{O}(p/(nT))$ convergence rate when the global cost function additionally satisfies the Polyak-Łojasiewicz (P-L) condition, where $p$ and $T$ are the dimension of the decision variable and the total number of iterations, respectively. To the best of our knowledge, this is the first linear speedup result for distributed ZO algorithms, which enables systematic processing performance improvements by adding more agents. We also show that the proposed algorithms converge linearly when considering deterministic centralized optimization problems under the P-L condition. We demonstrate through numerical experiments the efficiency of our algorithms on generating adversarial examples from deep neural networks in comparison with baseline and recently proposed centralized and distributed ZO algorithms.

Optimization and Control

Accelerated Zeroth-order Algorithm for Stochastic Distributed Nonconvex Optimization

Shengjun Zhang, Colleen P. Bailey, 2021

This paper investigates how to accelerate the convergence of distributed optimization algorithms on nonconvex problems when only zeroth-order information is available. We propose a zeroth-order (ZO) distributed primal-dual stochastic coordinates algorithm equipped with the powerball method to accelerate convergence. We prove that the proposed algorithm has a convergence rate of $\mathcal{O}(\sqrt{p}/\sqrt{nT})$ for general nonconvex cost functions. We use the problem of generating adversarial examples from black-box DNNs to compare the proposed algorithm with the existing state-of-the-art centralized and distributed ZO algorithms. The numerical results demonstrate the faster convergence rate of the proposed algorithm and match the theoretical analysis.

Optimization and Control

Stabilized SVRG: Simple Variance Reduction for Nonconvex Optimization

Rong Ge, Zhize Li, Weiyao Wang, 2019

Variance reduction techniques like SVRG provide simple and fast algorithms for optimizing a convex finite-sum objective. For nonconvex objectives, these techniques can also find a first-order stationary point (with small gradient). However, in nonconvex optimization it is often crucial to find a second-order stationary point (with small gradient and almost PSD Hessian). In this paper, we show that Stabilized SVRG (a simple variant of SVRG) can find an $\epsilon$-second-order stationary point using only $\widetilde{O}(n^{2/3}/\epsilon^2+n/\epsilon^{1.5})$ stochastic gradients. To our best knowledge, this is the first second-order guarantee for a simple variant of SVRG. The running time almost matches the known guarantees for finding $\epsilon$-first-order stationary points.

Machine Learning, Optimization and Control
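For readers unfamiliar with the base technique, the following is a minimal sketch of plain first-order SVRG on a finite-sum objective. It is not the Stabilized SVRG variant from the abstract above, and the function name `svrg`, the per-sample gradient callback `grad_i`, and the default step size and epoch count are illustrative assumptions.

```python
import numpy as np

def svrg(grad_i, x0, n, step=0.01, epochs=100, inner_steps=None, rng=None):
    """Plain SVRG for minimizing (1/n) * sum_i f_i(x).

    grad_i(x, i) must return the gradient of the i-th component f_i at x
    (a hypothetical callback supplied by the caller).
    """
    rng = np.random.default_rng() if rng is None else rng
    inner_steps = n if inner_steps is None else inner_steps
    x_snap = np.asarray(x0, dtype=float).copy()
    for _ in range(epochs):
        # Full gradient at the snapshot, computed once per epoch.
        full_grad = sum(grad_i(x_snap, i) for i in range(n)) / n
        x = x_snap.copy()
        for _ in range(inner_steps):
            i = rng.integers(n)
            # Variance-reduced direction: the stochastic gradient is corrected
            # by the same sample's gradient at the snapshot.
            v = grad_i(x, i) - grad_i(x_snap, i) + full_grad
            x -= step * v
        x_snap = x  # the last inner iterate becomes the next snapshot
    return x_snap
```

A usage example on a small, consistent least-squares problem, where `A`, `b`, and `x_true` are made-up test data:

```python
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 3))
x_true = np.array([1.0, -2.0, 0.5])
b = A @ x_true
x_hat = svrg(lambda x, i: (A[i] @ x - b[i]) * A[i], np.zeros(3), n=50)
```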

Stochastic Variance Reduction for Deep Q-learning

Wei-Ye Zhao, Xi-Ya Guan, Yang Liu, 2019

Recent advances in deep reinforcement learning have achieved human-level performance on a variety of real-world applications. However, the current algorithms still suffer from poor gradient estimation with excessive variance, resulting in unstable training and poor sample efficiency. In our paper, we propose an innovative optimization strategy that utilizes stochastic variance reduced gradient (SVRG) techniques. In extensive experiments on the Atari domain, our method outperforms the deep Q-learning baselines on 18 out of 20 games.

Machine Learning

Zeroth Order Non-convex optimization with Dueling-Choice Bandits

Yichong Xu, Aparna Joshi, Aarti Singh, 2019

We consider a novel setting of zeroth order non-convex optimization, where in addition to querying the function value at a given point, we can also duel two points and get the point with the larger function value. We refer to this setting as optimization with dueling-choice bandits since both direct queries and duels are available for optimization. We give the COMP-GP-UCB algorithm based on GP-UCB (Srinivas et al., 2009), where instead of directly querying the point with the maximum Upper Confidence Bound (UCB), we perform a constrained optimization and use comparisons to filter out suboptimal points. COMP-GP-UCB comes with a theoretical guarantee of $O(\frac{\Phi}{\sqrt{T}})$ on simple regret, where $T$ is the number of direct queries and $\Phi$ is an improved information gain corresponding to a comparison-based constraint set that restricts the search space for the optimum. In contrast, in the direct-query-only setting, $\Phi$ depends on the entire domain. Finally, we present experimental results to show the efficacy of our algorithm.

Machine Learning