Outlier-robust moment-estimation via sum-of-squares

78 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل David Steurer

تاريخ النشر 2017

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Pravesh K. Kothari - David Steurer

بنى وهياكل البيانات والخوارزميات التعلم الآلي التعلم الالي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

We develop efficient algorithms for estimating low-degree moments of unknown distributions in the presence of adversarial outliers. The guarantees of our algorithms improve in many cases significantly over the best previous ones, obtained in recent works of Diakonikolas et al, Lai et al, and Charikar et al. We also show that the guarantees of our algorithms match information-theoretic lower-bounds for the class of distributions we consider. These improved guarantees allow us to give improved algorithms for independent component analysis and learning mixtures of Gaussians in the presence of outliers. Our algorithms are based on a standard sum-of-squares relaxation of the following conceptually-simple optimization problem: Among all distributions whose moments are bounded in the same way as for the unknown distribution, find the one that is closest in statistical distance to the empirical distribution of the adversarially-corrupted sample.

قيم البحث

87 - Prasad Raghavendra , Tselil Schramm , David Steurer 2018

Estimation is the computational task of recovering a hidden parameter $x$ associated with a distribution $D_x$, given a measurement $y$ sampled from the distribution. High dimensional estimation problems arise naturally in statistics, machine learnin g, and complexity theory. Many high dimensional estimation problems can be formulated as systems of polynomial equations and inequalities, and thus give rise to natural probability distributions over polynomial systems. Sum-of-squares proofs provide a powerful framework to reason about polynomial systems, and further there exist efficient algorithms to search for low-degree sum-of-squares proofs. Understanding and characterizing the power of sum-of-squares proofs for estimation problems has been a subject of intense study in recent years. On one hand, there is a growing body of work utilizing sum-of-squares proofs for recovering solutions to polynomial systems when the system is feasible. On the other hand, a general technique referred to as pseudocalibration has been developed towards showing lower bounds on the degree of sum-of-squares proofs. Finally, the existence of sum-of-squares refutations of a polynomial system has been shown to be intimately connected to the existence of spectral algorithms. In this article we survey these developments.

بنى وهياكل البيانات والخوارزميات التعقيد الحسابي التعلم الآلي

Dictionary Learning and Tensor Decomposition via the Sum-of-Squares Method

398 - Boaz Barak , Jonathan A. Kelner , David Steurer 2014

We give a new approach to the dictionary learning (also known as sparse coding) problem of recovering an unknown $ntimes m$ matrix $A$ (for $m geq n$) from examples of the form [ y = Ax + e, ] where $x$ is a random vector in $mathbb R^m$ with at most $tau m$ nonzero coordinates, and $e$ is a random noise vector in $mathbb R^n$ with bounded magnitude. For the case $m=O(n)$, our algorithm recovers every column of $A$ within arbitrarily good constant accuracy in time $m^{O(log m/log(tau^{-1}))}$, in particular achieving polynomial time if $tau = m^{-delta}$ for any $delta>0$, and time $m^{O(log m)}$ if $tau$ is (a sufficiently small) constant. Prior algorithms with comparable assumptions on the distribution required the vector $x$ to be much sparser---at most $sqrt{n}$ nonzero coordinates---and there were intrinsic barriers preventing these algorithms from applying for denser $x$. We achieve this by designing an algorithm for noisy tensor decomposition that can recover, under quite general conditions, an approximate rank-one decomposition of a tensor $T$, given access to a tensor $T$ that is $tau$-close to $T$ in the spectral norm (when considered as a matrix). To our knowledge, this is the first algorithm for tensor decomposition that works in the constant spectral-norm noise regime, where there is no guarantee that the local optima of $T$ and $T$ have similar structures. Our algorithm is based on a novel approach to using and analyzing the Sum of Squares semidefinite programming hierarchy (Parrilo 2000, Lasserre 2001), and it can be viewed as an indication of the utility of this very general and powerful tool for unsupervised learning problems.

بنى وهياكل البيانات والخوارزميات التعلم الآلي التعلم الالي

Rounding Sum-of-Squares Relaxations

390 - Boaz Barak , Jonathan Kelner , David Steurer 2013

We present a general approach to rounding semidefinite programming relaxations obtained by the Sum-of-Squares method (Lasserre hierarchy). Our approach is based on using the connection between these relaxations and the Sum-of-Squares proof system to transform a *combining algorithm* -- an algorithm that maps a distribution over solutions into a (possibly weaker) solution -- into a *rounding algorithm* that maps a solution of the relaxation to a solution of the original problem. Using this approach, we obtain algorithms that yield improved results for natural variants of three well-known problems: 1) We give a quasipolynomial-time algorithm that approximates the maximum of a low degree multivariate polynomial with non-negative coefficients over the Euclidean unit sphere. Beyond being of interest in its own right, this is related to an open question in quantum information theory, and our techniques have already led to improved results in this area (Brand~{a}o and Harrow, STOC 13). 2) We give a polynomial-time algorithm that, given a d dimensional subspace of R^n that (almost) contains the characteristic function of a set of size n/k, finds a vector $v$ in the subspace satisfying $|v|_4^4 > c(k/d^{1/3}) |v|_2^2$, where $|v|_p = (E_i v_i^p)^{1/p}$. Aside from being a natural relaxation, this is also motivated by a connection to the Small Set Expansion problem shown by Barak et al. (STOC 2012) and our results yield a certain improvement for that problem. 3) We use this notion of L_4 vs. L_2 sparsity to obtain a polynomial-time algorithm with substantially improved guarantees for recovering a planted $mu$-sparse vector v in a random d-dimensional subspace of R^n. If v has mu n nonzero coordinates, we can recover it with high probability whenever $mu < O(min(1,n/d^2))$, improving for $d < n^{2/3}$ prior methods which intrinsically required $mu < O(1/sqrt(d))$.

بنى وهياكل البيانات والخوارزميات التعلم الآلي فيزياء الكم

Provably Safe Tolerance Estimation for Robot Arms via Sum-of-Squares Programming

224 - Weiye Zhao , Suqin He , 2021

Tolerance estimation problems are prevailing in engineering applications. For example, in modern robotics, it remains challenging to efficiently estimate joint tolerance, ie the maximal allowable deviation from a reference robot state such that safet y constraints are still satisfied. This paper presented an efficient algorithm to estimate the joint tolerance using sum-of-squares programming. It is theoretically proved that the algorithm provides a tight lower bound of the joint tolerance. Extensive numerical studies demonstrate that the proposed method is computationally efficient and near optimal. The algorithm is implemented in the JTE toolbox and is available at url{https://github.com/intelligent-control-lab/Sum-of-Square-Safety-Optimization}.

علم الروبوتات التعلم الآلي

Sublinear Least-Squares Value Iteration via Locality Sensitive Hashing

114 - Anshumali Shrivastava , Zhao Song , Zhaozhuo Xu 2021

We present the first provable Least-Squares Value Iteration (LSVI) algorithms that have runtime complexity sublinear in the number of actions. We formulate the value function estimation procedure in value iteration as an approximate maximum inner pro duct search problem and propose a locality sensitive hashing (LSH) [Indyk and Motwani STOC98, Andoni and Razenshteyn STOC15, Andoni, Laarhoven, Razenshteyn and Waingarten SODA17] type data structure to solve this problem with sublinear time complexity. Moreover, we build the connections between the theory of approximate maximum inner product search and the regret analysis of reinforcement learning. We prove that, with our choice of approximation factor, our Sublinear LSVI algorithms maintain the same regret as the original LSVI algorithms while reducing the runtime complexity to sublinear in the number of actions. To the best of our knowledge, this is the first work that combines LSH with reinforcement learning resulting in provable improvements. We hope that our novel way of combining data-structures and iterative algorithm will open the door for further study into cost reduction in optimization.

بنى وهياكل البيانات والخوارزميات التعلم الآلي التعلم الالي