
Large-Scale Quadratically Constrained Quadratic Program via Low-Discrepancy Sequences

Published by Kinjal Basu
Publication date: 2017
Research field: Mathematical Statistics
Language: English





We consider the problem of solving a large-scale Quadratically Constrained Quadratic Program. Such problems occur naturally in many scientific and web applications. Although there are efficient methods that tackle this problem, they are mostly not scalable. In this paper, we develop a method that transforms the quadratic constraint into a linear form by sampling a set of low-discrepancy points. The transformed problem can then be solved by any state-of-the-art large-scale quadratic programming solver. We show the convergence of our approximate solution to the true solution and derive finite sample error bounds. Experimental results demonstrate both scalability and improved quality of approximation in practice.
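The following is a minimal sketch of the linearization idea described above, under the assumption that the quadratic constraint is convex, i.e. of the form $x^T A x \le b$ with $A$ positive definite, so that it is equivalent to the family of linear constraints $u^T M x \le \sqrt{b}$ over unit directions $u$ (where $M^T M = A$), with the directions drawn from a Sobol sequence. The dimensions, the toy objective, and names such as `M` and `n_dirs` are illustrative and not the paper's exact construction.

```python
# A minimal sketch (not the paper's exact construction): replace the convex
# quadratic constraint x^T A x <= b by linear cuts along low-discrepancy
# directions. Assumes A is positive definite, so x^T A x <= b is equivalent
# to u^T M x <= sqrt(b) for every unit vector u, where M^T M = A.
import numpy as np
from scipy.stats import qmc, norm
from scipy.optimize import minimize, LinearConstraint

dim, n_dirs, b = 5, 256, 1.0                 # illustrative sizes
A = np.eye(dim)                              # hypothetical constraint matrix (PD)
M = np.linalg.cholesky(A).T                  # M^T M = A

# Low-discrepancy points in [0,1)^d mapped to (quasi-)uniform unit directions
sobol = qmc.Sobol(d=dim, scramble=True, seed=0)
U = norm.ppf(1e-4 + (1 - 2e-4) * sobol.random(n_dirs))   # keep ppf away from 0/1
U /= np.linalg.norm(U, axis=1, keepdims=True)

# Each sampled direction u_i contributes one linear constraint u_i^T M x <= sqrt(b)
lin_con = LinearConstraint(U @ M, -np.inf, np.sqrt(b))

# The linearized problem can now be handed to any large-scale QP solver;
# a generic solver on a toy quadratic objective is used here for illustration.
c = np.ones(dim)
res = minimize(lambda x: 0.5 * x @ x - c @ x, np.zeros(dim),
               constraints=[lin_con], method="SLSQP")
print(res.x, res.x @ A @ res.x)              # approximately feasible for x^T A x <= b
```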




Read also

We prove that a special variety of quadratically constrained quadratic programs, occurring frequently in conjunction with the design of wave systems obeying causality and passivity (i.e. systems with bounded response), universally exhibit strong duality. Directly, the problem of continuum (grayscale or effective medium) device design for any (complex) quadratic wave objective governed by independent quadratic constraints can be solved as a convex program. The result guarantees that performance limits for many common physical objectives can be made nearly tight, and suggests far-reaching implications for problems in optics, acoustics, and quantum mechanics.
We consider the problem of communication over a channel with a causal jamming adversary subject to quadratic constraints. A sender Alice wishes to communicate a message to a receiver Bob by transmitting a real-valued length-$n$ codeword $\mathbf{x} = x_1, \ldots, x_n$ through a communication channel. Alice and Bob do not share common randomness. Knowing Alice's encoding strategy, an adversarial jammer James chooses a real-valued length-$n$ noise sequence $\mathbf{s} = s_1, \ldots, s_n$ in a causal manner, i.e., each $s_t$ ($1 \le t \le n$) can only depend on $x_1, \ldots, x_t$. Bob receives $\mathbf{y}$, the sum of Alice's transmission $\mathbf{x}$ and James's jamming vector $\mathbf{s}$, and is required to reliably estimate Alice's message from this sum. In addition, Alice's and James's transmission powers are restricted by quadratic constraints $P > 0$ and $N > 0$. In this work, we characterize the capacity of such a channel as the limit superior of the optimal values of a sequence of optimization problems. Upper and lower bounds on these optimal values are provided both analytically and numerically. Interestingly, unlike many communication problems, in this causal setting Alice's optimal codebook may not have a uniform power allocation: for certain SNRs, a codebook with a two-level uniform power allocation achieves a strictly higher rate than a codebook with a uniform power allocation would.
We study nonconvex homogeneous quadratically constrained quadratic optimization with one or two constraints, denoted by (QQ1) and (QQ2), respectively. (QQ2) contains (QQ1), the trust region subproblem (TRS) and the ellipsoid regularized total least squares problem as special cases. It is known that there is a necessary and sufficient optimality condition for the global minimizer of (QQ2). In this paper, we first show that any local minimizer of (QQ1) is globally optimal. Unlike its special case (TRS), which has at most one local non-global minimizer, (QQ2) may have infinitely many local non-global minimizers. At any local non-global minimizer of (QQ2), both the linearly independent constraint qualification and the strict complementarity condition hold, and the Hessian of the Lagrangian has exactly one negative eigenvalue. As a main contribution, we prove that the standard second-order sufficient optimality condition for any strict local non-global minimizer of (QQ2) remains necessary. Applications and the impossibility of further extension are discussed.
Da Yu, Huishuai Zhang, Wei Chen (2021)
We propose a reparametrization scheme to address the challenges of applying differentially private SGD to large neural networks, namely 1) the huge memory cost of storing individual gradients, and 2) the added noise suffering from notorious dimensional dependence. Specifically, we reparametrize each weight matrix with two gradient-carrier matrices of small dimension and a residual weight matrix. We argue that such a reparametrization keeps the forward/backward process unchanged while enabling us to compute the projected gradient without computing the gradient itself. To learn with differential privacy, we design reparametrized gradient perturbation (RGP), which perturbs the gradients on the gradient-carrier matrices and reconstructs an update for the original weight from the noisy gradients. Importantly, we use historical updates to find the gradient-carrier matrices, whose optimality is rigorously justified under linear regression and empirically verified with deep learning tasks. RGP significantly reduces the memory cost and improves the utility. For example, we are the first to apply differential privacy to the BERT model, achieving an average accuracy of $83.9\%$ on four downstream tasks with $\epsilon = 8$, which is within $5\%$ of the non-private baseline while enjoying much lower privacy leakage risk.
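Below is a minimal numpy sketch of the reparametrized-gradient-perturbation idea outlined in this abstract, assuming the gradient carriers are taken as the top singular vectors of a historical update. The rank `r`, clipping norm `C`, noise scale `sigma`, and the reconstruction formula are illustrative assumptions, and real DP-SGD clips per-example gradients rather than a single aggregated gradient as done here.

```python
# A hedged sketch of reparametrized gradient perturbation (RGP) for one weight
# matrix. Assumptions: the carriers are the top-r SVD factors of a historical
# update; clipping is applied to a single gradient for brevity; r, C and sigma
# are illustrative.
import numpy as np

def rgp_update(grad_W, hist_update, r=4, C=1.0, sigma=1.0, rng=None):
    rng = rng or np.random.default_rng(0)
    # Gradient-carrier bases L (d_out x r) and R (r x d_in) from the history.
    U, _, Vt = np.linalg.svd(hist_update, full_matrices=False)
    L, R = U[:, :r], Vt[:r, :]
    # Project the gradient onto the carriers; only these small matrices are perturbed.
    g_L, g_R = grad_W @ R.T, L.T @ grad_W
    for g in (g_L, g_R):                                  # clip to norm C
        g *= min(1.0, C / (np.linalg.norm(g) + 1e-12))
    g_L = g_L + sigma * C * rng.standard_normal(g_L.shape)  # add Gaussian noise
    g_R = g_R + sigma * C * rng.standard_normal(g_R.shape)
    # Reconstruct a noisy full-size update from the carrier gradients
    # (projection onto the subspaces spanned by the columns of L and rows of R).
    return g_L @ R + L @ g_R - L @ (L.T @ g_L) @ R

# Toy usage: a 64x32 "weight" gradient and a fake historical update.
rng = np.random.default_rng(1)
grad_W = rng.standard_normal((64, 32))
hist = rng.standard_normal((64, 32))
print(rgp_update(grad_W, hist).shape)                      # (64, 32)
```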
The sparse inverse covariance estimation problem is commonly solved using an $\ell_1$-regularized Gaussian maximum likelihood estimator known as the graphical lasso, but its computational cost becomes prohibitive for large data sets. A recent line of results showed, under mild assumptions, that the graphical lasso estimator can be retrieved by soft-thresholding the sample covariance matrix and solving a maximum determinant matrix completion (MDMC) problem. This paper proves an extension of this result and describes a Newton-CG algorithm to efficiently solve the MDMC problem. Assuming that the thresholded sample covariance matrix is sparse with a sparse Cholesky factorization, we prove that the algorithm converges to an $\epsilon$-accurate solution in $O(n \log(1/\epsilon))$ time and $O(n)$ memory. The algorithm is highly efficient in practice: we solve the associated MDMC problems with as many as 200,000 variables to 7-9 digits of accuracy in less than an hour on a standard laptop computer running MATLAB.
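As a small illustration of the soft-thresholding step mentioned above (the MDMC Newton-CG solver itself is not reproduced), the sketch below soft-thresholds a sample covariance matrix; the threshold `lam` and the choice to leave the diagonal untouched are illustrative assumptions.

```python
# A minimal sketch of soft-thresholding a sample covariance matrix; the
# threshold lam and the treatment of the diagonal are illustrative assumptions,
# and the subsequent MDMC / Newton-CG step is not shown here.
import numpy as np

def soft_threshold_covariance(S, lam):
    T = np.sign(S) * np.maximum(np.abs(S) - lam, 0.0)   # shrink entries toward zero
    np.fill_diagonal(T, np.diag(S))                      # keep the diagonal as-is
    return T

X = np.random.default_rng(0).standard_normal((500, 50))  # toy data: 500 samples, 50 variables
S = np.cov(X, rowvar=False)
T = soft_threshold_covariance(S, lam=0.1)
print(f"fraction of zero entries after thresholding: {np.mean(T == 0):.2f}")
```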