In list-decodable subspace recovery, the input is a collection of $n$ points, $\alpha n$ of which (for some $\alpha \ll 1/2$) are drawn i.i.d. from a distribution $\mathcal{D}$ with an isotropic rank-$r$ covariance $\Pi_*$ (the \emph{inliers}); the rest are arbitrary, potentially adversarial outliers. The goal is to recover a list of $O(1/\alpha)$ candidate covariances that contains a $\hat{\Pi}$ close to $\Pi_*$. Two recent independent works (Raghavendra-Yau and Bakshi-Kothari, 2020) gave the first efficient algorithms for this problem. These results, however, obtain an error that grows with the dimension (linearly in [RY], and logarithmically in [BK] at the cost of quasi-polynomial running time) and rely on \emph{certifiable anti-concentration} - a relatively strict condition satisfied essentially only by the Gaussian distribution. In this work, we improve on these results on all three fronts: \emph{dimension-independent} error via a faster, fixed-polynomial running time under less restrictive distributional assumptions. Specifically, we give a $\mathrm{poly}(1/\alpha) d^{O(1)}$-time algorithm that outputs a list containing a $\hat{\Pi}$ satisfying $\|\hat{\Pi} - \Pi_*\|_F \leq O(1/\alpha)$. Our result only needs $\mathcal{D}$ to have \emph{certifiably hypercontractive} degree-2 polynomials. As a result, in addition to Gaussians, our algorithm applies to the uniform distribution on the hypercube and $q$-ary cubes, and to arbitrary product distributions with subgaussian marginals. Prior work (Raghavendra and Yau, 2020) had identified such distributions as potential hard examples, as they do not exhibit strong enough anti-concentration. When $\mathcal{D}$ satisfies certifiable anti-concentration, we obtain a stronger error guarantee of $\|\hat{\Pi} - \Pi_*\|_F \leq \eta$ for any arbitrary $\eta > 0$ in time $d^{O(\mathrm{poly}(1/\alpha) + \log(1/\eta))}$.
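The problem setup above can be illustrated with a small toy sketch (not the paper's algorithm): we sample an $\alpha$ fraction of inliers from an isotropic distribution on a random rank-$r$ subspace with projector $\Pi_*$, plant the remaining points on a decoy subspace, and measure how far a candidate projector is from $\Pi_*$ in Frobenius norm. All dimensions, sample sizes, and the decoy construction are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

d, r, n = 20, 5, 1000
alpha = 0.2  # inlier fraction; list-decodable regime has alpha << 1/2

# Inlier distribution D: isotropic on a rank-r subspace.
# Pi_star is the orthogonal projector onto a random r-dimensional subspace.
Q, _ = np.linalg.qr(rng.standard_normal((d, r)))
Pi_star = Q @ Q.T

n_in = int(alpha * n)
inliers = rng.standard_normal((n_in, r)) @ Q.T  # i.i.d., covariance Pi_star

# The remaining (1 - alpha) fraction is arbitrary; here we plant them on a
# different random subspace to confuse naive (e.g., top-r PCA) estimators.
Q_bad, _ = np.linalg.qr(rng.standard_normal((d, r)))
outliers = rng.standard_normal((n - n_in, r)) @ Q_bad.T

X = np.vstack([inliers, outliers])  # the corrupted input sample

# A valid output is a short list of candidate projectors; success means some
# candidate Pi_hat in the list is close to Pi_star in Frobenius norm.
def frob_error(Pi_hat):
    return np.linalg.norm(Pi_hat - Pi_star, "fro")

print(frob_error(Pi_star))          # 0.0 (the true projector)
print(frob_error(Q_bad @ Q_bad.T))  # large: the decoy subspace is far away
```

Note that with a majority of outliers a single estimate cannot succeed: the decoy subspace is statistically indistinguishable from the true one, which is exactly why the output must be a list.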
We give the first polynomial-time algorithm for robust regression in the list-decodable setting where an adversary can corrupt a greater than $1/2$ fraction of examples. For any $\alpha < 1$, our algorithm takes as input a sample $\{(x_i,y_i)\}_{i \leq
We give a polynomial time algorithm for the lossy population recovery problem. In this problem, the goal is to approximately learn an unknown distribution on binary strings of length $n$ from lossy samples: for some parameter $\mu$, each coordinate of
Kernel methods are fundamental in machine learning, and faster algorithms for kernel approximation provide direct speedups for many core tasks in machine learning. The polynomial kernel is especially important as other kernels can often be approximat
List-decodable codes have been an active topic in theoretical computer science since the seminal papers of M. Sudan and V. Guruswami in 1997-1998. There are general results about the Johnson radius and the list-decoding capacity theorem for random c
The Discrete Fourier Transform (DFT) is a fundamental computational primitive, and the fastest known algorithm for computing the DFT is the FFT (Fast Fourier Transform) algorithm. One remarkable feature of the FFT is that its runtime depends onl