We present an algorithm for strongly refuting smoothed instances of all Boolean CSPs. The smoothed model is a hybrid between worst-case and average-case input models, where the input is an arbitrary instance of the CSP with only the negation patterns of the literals re-randomized with some small probability. For an $n$-variable smoothed instance of a $k$-arity CSP, our algorithm runs in $n^{O(\ell)}$ time, and succeeds with high probability in bounding the optimum fraction of satisfiable constraints away from $1$, provided that the number of constraints is at least $\tilde{O}(n) \cdot (\frac{n}{\ell})^{\frac{k}{2} - 1}$. This matches, up to polylogarithmic factors in $n$, the trade-off between running time and the number of constraints of the state-of-the-art algorithms for refuting fully random instances of CSPs [RRS17]. We also make a surprising new connection between our algorithm and even covers in hypergraphs, which we use to positively resolve Feige's 2008 conjecture, an extremal combinatorics conjecture on the existence of even covers in sufficiently dense hypergraphs that generalizes the well-known Moore bound for the girth of graphs. As a corollary, we show that polynomial-size refutation witnesses exist for arbitrary smoothed CSP instances with number of constraints a polynomial factor below the spectral threshold of $n^{k/2}$, extending the celebrated result for random 3-SAT of Feige, Kim and Ofek [FKO06].
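As a small illustration of the smoothed input model described above (not of the refutation algorithm itself), the sketch below perturbs only the negation pattern of an otherwise arbitrary instance; the 0/1 array encoding and the function name are assumptions made here for concreteness.

```python
import numpy as np

def smooth_negations(negations, p=0.01, rng=None):
    """negations: (m, k) 0/1 array; entry (i, j) says whether the j-th literal
    of constraint i is negated (chosen adversarially). Each entry is
    independently replaced by a fresh uniform random bit with probability p;
    the variables and predicate of each constraint are left untouched."""
    rng = np.random.default_rng() if rng is None else rng
    negations = np.asarray(negations).copy()
    mask = rng.random(negations.shape) < p
    negations[mask] = rng.integers(0, 2, size=int(mask.sum()))
    return negations
```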
In this work, we show, for the well-studied problem of learning parity under noise, where a learner tries to learn $x=(x_1,\ldots,x_n) \in \{0,1\}^n$ from a stream of random linear equations over $\mathrm{F}_2$ that are correct with probability $\frac{1}{2}+\varepsilon$ and flipped with probability $\frac{1}{2}-\varepsilon$, that any learning algorithm requires either a memory of size $\Omega(n^2/\varepsilon)$ or an exponential number of samples. In fact, we study memory-sample lower bounds for a large class of learning problems, as characterized by [GRT18], when the samples are noisy. A matrix $M: A \times X \rightarrow \{-1,1\}$ corresponds to the following learning problem with error parameter $\varepsilon$: an unknown element $x \in X$ is chosen uniformly at random. A learner tries to learn $x$ from a stream of samples, $(a_1, b_1), (a_2, b_2), \ldots$, where for every $i$, $a_i \in A$ is chosen uniformly at random and $b_i = M(a_i,x)$ with probability $1/2+\varepsilon$ and $b_i = -M(a_i,x)$ with probability $1/2-\varepsilon$ ($0<\varepsilon< \frac{1}{2}$). Assume that $k, \ell, r$ are such that any submatrix of $M$ with at least $2^{-k} \cdot |A|$ rows and at least $2^{-\ell} \cdot |X|$ columns has a bias of at most $2^{-r}$. We show that any learning algorithm for the learning problem corresponding to $M$ with error parameter $\varepsilon$ requires either a memory of size at least $\Omega\left(\frac{k \cdot \ell}{\varepsilon}\right)$, or at least $2^{\Omega(r)}$ samples. In particular, this shows that for a large class of learning problems, the same as those in [GRT18], any learning algorithm requires either a memory of size at least $\Omega\left(\frac{(\log |X|) \cdot (\log |A|)}{\varepsilon}\right)$ or an exponential number of noisy samples. Our proof is based on adapting the arguments in [Raz17, GRT18] to the noisy case.
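A minimal sketch of the sample stream for the special case of parity learning with noise, i.e. $b_i = \langle a_i, x \rangle \bmod 2$ flipped with probability $\frac{1}{2}-\varepsilon$; the numpy 0/1 encoding and the function name are assumptions made here, not part of the paper.

```python
import numpy as np

def noisy_parity_stream(x, eps, num_samples, rng=None):
    """Yield samples (a_i, b_i): a_i uniform in {0,1}^n and b_i equal to
    <a_i, x> mod 2 with probability 1/2 + eps, flipped otherwise."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x)
    for _ in range(num_samples):
        a = rng.integers(0, 2, size=x.shape[0])
        b = int(a @ x % 2)
        if rng.random() >= 0.5 + eps:   # happens with probability 1/2 - eps
            b ^= 1
        yield a, b
```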
We prove that with high probability over the choice of a random graph $G$ from the Erdős-Rényi distribution $G(n,1/2)$, a natural $n^{O(\varepsilon^2 \log n)}$-time, degree $O(\varepsilon^2 \log n)$ sum-of-squares semidefinite program cannot refute the existence of a valid $k$-coloring of $G$ for $k = n^{1/2+\varepsilon}$. Our result implies that the refutation guarantee of the basic semidefinite program (a close variant of the Lovász theta function) cannot be appreciably improved by a natural $o(\log n)$-degree sum-of-squares strengthening, and this is tight up to an $n^{o(1)}$ slack in $k$. To the best of our knowledge, this is the first lower bound for coloring $G(n,1/2)$ for even a single-round strengthening of the basic SDP in any SDP hierarchy. Our proof relies on a new variant of instance-preserving non-pointwise complete reduction within SoS from coloring a graph to finding large independent sets in it. Our proof is (perhaps surprisingly) short, simple and does not require the complicated spectral norm bounds on random matrices with dependent entries that have otherwise been necessary in the proofs of many similar results [BHK+16, HKP+17, KB19, GJJ+20, MRX20]. Our result formally holds for a constraint system where vertices are allowed to belong to multiple color classes; we leave the extension to the formally stronger formulation of coloring, where vertices must belong to unique color classes, as an outstanding open problem.
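For reference, here is a minimal cvxpy sketch of a theta-function-style SDP bound on independent sets, which connects to coloring because any valid $k$-coloring of $G$ contains an independent set of size $n/k$. This illustrates the flavor of the basic SDP mentioned above, not the sum-of-squares strengthening or the lower-bound construction; the function name is made up here.

```python
import cvxpy as cp
import numpy as np

def theta_independent_set_bound(adj):
    """Lovász-theta-style SDP upper bound on the independence number of the
    graph with 0/1 adjacency matrix adj. If the bound is below n/k, the graph
    has no independent set of size n/k and hence no valid k-coloring."""
    n = adj.shape[0]
    B = cp.Variable((n, n), symmetric=True)
    constraints = [B >> 0, cp.trace(B) == 1]
    constraints += [B[i, j] == 0
                    for i in range(n) for j in range(i + 1, n) if adj[i, j]]
    prob = cp.Problem(cp.Maximize(cp.sum(B)), constraints)
    prob.solve()
    return prob.value
```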
We study efficient algorithms for Sparse PCA in standard statistical models (spiked covariance in its Wishart form). Our goal is to achieve optimal recovery guarantees while being resilient to small perturbations. Despite a long history of prior works, including explicit studies of perturbation resilience, the best known algorithmic guarantees for Sparse PCA are fragile and break down under small adversarial perturbations. We observe a basic connection between perturbation resilience and \emph{certifying algorithms} that are based on certificates of upper bounds on sparse eigenvalues of random matrices. In contrast to other techniques, such certifying algorithms, including the brute-force maximum likelihood estimator, are automatically robust against small adversarial perturbations. We use this connection to obtain the first polynomial-time algorithms for this problem that are resilient against additive adversarial perturbations, by obtaining new efficient certificates for upper bounds on sparse eigenvalues of random matrices. Our algorithms are based either on basic semidefinite programming or on its low-degree sum-of-squares strengthening, depending on the parameter regime. Their guarantees either match or approach the best known guarantees of \emph{fragile} algorithms in terms of the sparsity of the unknown vector, the number of samples, and the ambient dimension. To complement our algorithmic results, we prove rigorous lower bounds matching the gap between fragile and robust polynomial-time algorithms in a natural computational model based on low-degree polynomials (closely related to the pseudo-calibration technique for sum-of-squares lower bounds) that is known to capture the best known guarantees for related statistical estimation problems. The combination of these results provides formal evidence of an inherent price to pay to achieve robustness.
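A minimal sketch of the spiked covariance (Wishart) input model named above, with a $k$-sparse unit spike; the encoding and the function name are assumptions made here for illustration, not the paper's.

```python
import numpy as np

def spiked_wishart_samples(n, d, k, beta, rng=None):
    """Draw n samples from N(0, I_d + beta * v v^T), where v is a k-sparse
    unit vector; an adversary may later add a small perturbation to each row."""
    rng = np.random.default_rng() if rng is None else rng
    v = np.zeros(d)
    support = rng.choice(d, size=k, replace=False)
    v[support] = rng.choice([-1.0, 1.0], size=k) / np.sqrt(k)
    # x = z + sqrt(beta) * g * v with z ~ N(0, I_d) and g ~ N(0, 1)
    # has covariance exactly I_d + beta * v v^T.
    z = rng.standard_normal((n, d))
    g = rng.standard_normal((n, 1))
    return z + np.sqrt(beta) * g * v, v
```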
We give an efficient algorithm to strongly refute \emph{semi-random} instances of all Boolean constraint satisfaction problems. The number of constraints required by our algorithm matches (up to polylogarithmic factors) the best-known bounds for efficient refutation of fully random instances. Our main technical contribution is an algorithm to strongly refute semi-random instances of the Boolean $k$-XOR problem on $n$ variables that have $\widetilde{O}(n^{k/2})$ constraints. (In a semi-random $k$-XOR instance, the equations can be arbitrary and only the right-hand sides are random.) One of our key insights is to identify a simple combinatorial property of random XOR instances that makes spectral refutation work. Our approach involves taking an instance that does not satisfy this property (i.e., is \emph{not} pseudorandom) and reducing it to a partitioned collection of $2$-XOR instances. We analyze these subinstances using a carefully chosen quadratic form as a proxy, which in turn is bounded via a combination of spectral methods and semidefinite programming. The analysis of our spectral bounds relies only on an off-the-shelf matrix Bernstein inequality. Even for the purely random case, this leads to a shorter proof compared to the ones in the literature that rely on problem-specific trace-moment computations.
In this work, we establish lower bounds against memory-bounded algorithms for distinguishing between natural pairs of related distributions from samples that arrive in a streaming setting. In our first result, we show that any algorithm that distinguishes between the uniform distribution on $\{0,1\}^n$ and the uniform distribution on an $n/2$-dimensional linear subspace of $\{0,1\}^n$ with non-negligible advantage needs $2^{\Omega(n)}$ samples or $\Omega(n^2)$ memory. Our second result applies to distinguishing outputs of Goldreich's local pseudorandom generator from the uniform distribution on the output domain. Specifically, Goldreich's pseudorandom generator $G$ fixes a predicate $P:\{0,1\}^k \rightarrow \{0,1\}$ and a collection of subsets $S_1, S_2, \ldots, S_m \subseteq [n]$ of size $k$. For any seed $x \in \{0,1\}^n$, it outputs $P(x_{S_1}), P(x_{S_2}), \ldots, P(x_{S_m})$ where $x_{S_i}$ is the projection of $x$ to the coordinates in $S_i$. We prove that whenever $P$ is $t$-resilient (all non-zero Fourier coefficients of $(-1)^P$ are of degree $t$ or higher), then no algorithm with less than $n^\epsilon$ memory can distinguish the output of $G$ from the uniform distribution on $\{0,1\}^m$ with a large inverse-polynomial advantage, for stretch $m \le \left(\frac{n}{t}\right)^{\frac{(1-\epsilon)}{36}\cdot t}$ (barring some restrictions on $k$). The lower bound holds in the streaming model where at each time step $i$, $S_i \subseteq [n]$ is a randomly chosen (ordered) subset of size $k$ and the distinguisher sees either $P(x_{S_i})$ or a uniformly random bit, along with $S_i$. Our proof builds on the recently developed machinery for proving time-space trade-offs (Raz 2016 and follow-ups) for search and learning problems.
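A minimal sketch of the streaming distinguisher's input in the honest case, i.e. the generator's output on random ordered subsets, under an assumed numpy 0/1 encoding; `goldreich_stream` and `parity` are illustrative names, not from the paper.

```python
import numpy as np

def goldreich_stream(x, k, m, predicate, rng=None):
    """For i = 1..m, pick a uniformly random ordered subset S_i of [n] of
    size k and yield (S_i, P(x_{S_i})), as in the streaming model above."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x)
    for _ in range(m):
        S = rng.choice(len(x), size=k, replace=False)   # random ordered k-subset
        yield S, predicate(x[S])

def parity(bits):
    """The k-bit parity predicate, which is (k-1)-resilient."""
    return int(np.sum(bits) % 2)
```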
In list-decodable subspace recovery, the input is a collection of $n$ points, $\alpha n$ of which (for some $\alpha \ll 1/2$) are drawn i.i.d. from a distribution $\mathcal{D}$ with an isotropic rank-$r$ covariance $\Pi_*$ (the \emph{inliers}), while the rest are arbitrary, potentially adversarial outliers. The goal is to recover an $O(1/\alpha)$-size list of candidate covariances that contains a $\hat{\Pi}$ close to $\Pi_*$. Two recent independent works (Raghavendra-Yau, Bakshi-Kothari 2020) gave the first efficient algorithms for this problem. These results, however, obtain an error that grows with the dimension (linearly in [RY], and logarithmically in [BK] at the cost of quasi-polynomial running time), and rely on \emph{certifiable anti-concentration}, a relatively strict condition satisfied essentially only by the Gaussian distribution. In this work, we improve on these results on all three fronts: \emph{dimension-independent} error, a faster fixed-polynomial running time, and less restrictive distributional assumptions. Specifically, we give a $\mathrm{poly}(1/\alpha) \cdot d^{O(1)}$ time algorithm that outputs a list containing a $\hat{\Pi}$ satisfying $\|\hat{\Pi} - \Pi_*\|_F \leq O(1/\alpha)$. Our result only needs $\mathcal{D}$ to have \emph{certifiably hypercontractive} degree-2 polynomials. As a result, in addition to Gaussians, our algorithm applies to the uniform distribution on the hypercube and $q$-ary cubes, and to arbitrary product distributions with subgaussian marginals. Prior work (Raghavendra and Yau, 2020) had identified such distributions as potential hard examples, as they do not exhibit strong enough anti-concentration. When $\mathcal{D}$ satisfies certifiable anti-concentration, we obtain a stronger error guarantee of $\|\hat{\Pi}-\Pi_*\|_F \leq \eta$ for any arbitrary $\eta > 0$ in $d^{O(\mathrm{poly}(1/\alpha) + \log (1/\eta))}$ time.
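A minimal sketch of the list-decodable subspace recovery input model described above, with Gaussian inliers supported on a random rank-$r$ subspace and one crude choice of outliers standing in for the adversary; all names here are illustrative assumptions.

```python
import numpy as np

def subspace_recovery_instance(n, d, r, alpha, rng=None):
    """alpha*n inliers drawn i.i.d. from N(0, Pi_*) for a random rank-r
    projection Pi_*; the remaining (1-alpha)*n points are arbitrary
    (here: simply placed far from the inlier subspace)."""
    rng = np.random.default_rng() if rng is None else rng
    Q, _ = np.linalg.qr(rng.standard_normal((d, r)))   # orthonormal basis
    Pi_star = Q @ Q.T                                  # rank-r projection
    n_in = int(alpha * n)
    inliers = rng.standard_normal((n_in, r)) @ Q.T     # covariance = Pi_star
    outliers = 10.0 * rng.standard_normal((n - n_in, d))
    points = np.vstack([inliers, outliers])
    rng.shuffle(points)
    return points, Pi_star
```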
We give the first polynomial-time algorithm for robust regression in the list-decodable setting, where an adversary can corrupt more than a $1/2$ fraction of the examples. For any $\alpha < 1$, our algorithm takes as input a sample $\{(x_i,y_i)\}_{i \leq n}$ of $n$ linear equations, where $\alpha n$ of the equations satisfy $y_i = \langle x_i, \ell^* \rangle + \zeta$ for some small noise $\zeta$ and $(1-\alpha)n$ of the equations are \emph{arbitrarily} chosen. It outputs a list $L$ of size $O(1/\alpha)$ (a fixed constant) that contains an $\ell$ that is close to $\ell^*$. Our algorithm succeeds whenever the inliers are chosen from a \emph{certifiably} anti-concentrated distribution $D$. In particular, this gives a $(d/\alpha)^{O(1/\alpha^8)}$ time algorithm to find an $O(1/\alpha)$-size list when the inlier distribution is standard Gaussian. For discrete product distributions that are anti-concentrated only in \emph{regular} directions, we give an algorithm that achieves a similar guarantee under the promise that $\ell^*$ has all coordinates of the same magnitude. To complement our result, we prove that the anti-concentration assumption on the inliers is information-theoretically necessary. Our algorithm is based on a new framework for list-decodable learning that strengthens the "identifiability to algorithms" paradigm based on the sum-of-squares method. In an independent and concurrent work, Raghavendra and Yau also used the sum-of-squares method to give a similar result for list-decodable regression.
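A minimal sketch of the corrupted sample model described above, with Gaussian inlier covariates and one simple stand-in for the adversary's arbitrary equations (labels consistent with a decoy direction); the function and variable names are assumptions made here.

```python
import numpy as np

def list_decodable_regression_sample(n, d, alpha, noise=0.01, rng=None):
    """alpha*n inlier equations with y_i = <x_i, ell*> + small noise and
    x_i ~ N(0, I_d); the remaining (1-alpha)*n labels are chosen arbitrarily."""
    rng = np.random.default_rng() if rng is None else rng
    ell_star = rng.standard_normal(d)
    ell_star /= np.linalg.norm(ell_star)
    X = rng.standard_normal((n, d))
    n_in = int(alpha * n)
    y = np.empty(n)
    y[:n_in] = X[:n_in] @ ell_star + noise * rng.standard_normal(n_in)
    decoy = rng.standard_normal(d)            # the adversary's choice
    y[n_in:] = X[n_in:] @ decoy               # arbitrary outlier labels
    perm = rng.permutation(n)
    return X[perm], y[perm], ell_star
```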
We study the expressive power of kernel methods and the algorithmic feasibility of multiple kernel learning for a special rich class of kernels. Specifically, we define \emph{Euclidean kernels}, a diverse class that includes most, if not all, families of kernels studied in the literature, such as polynomial kernels and radial basis functions. We then describe the geometric and spectral structure of this family of kernels over the hypercube (and to some extent over any compact domain). Our structural results allow us to prove meaningful limitations on the expressive power of the class as well as to derive several efficient algorithms for learning kernels over different domains.
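The class of Euclidean kernels is defined in the paper itself; as a small illustration of the two named members, here is a minimal numpy sketch of the polynomial and radial basis function kernels, which over the hypercube $\{-1,1\}^n$ both depend only on the inner product $\langle x, y \rangle$ since $\|x - y\|^2 = 2n - 2\langle x, y \rangle$ there.

```python
import numpy as np

def polynomial_kernel(x, y, degree=3, c=1.0):
    """k(x, y) = (<x, y> + c)^degree."""
    return (np.dot(x, y) + c) ** degree

def rbf_kernel(x, y, gamma=0.5):
    """k(x, y) = exp(-gamma * ||x - y||^2), a radial basis function kernel."""
    return np.exp(-gamma * np.sum((x - y) ** 2))

x = np.array([1, -1, 1, 1])
y = np.array([1, 1, -1, 1])
print(polynomial_kernel(x, y), rbf_kernel(x, y))
```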
Several works have shown unconditional hardness (via integrality gaps) of computing equilibria using strong hierarchies of convex relaxations. Such results, however, only apply to the problem of computing equilibria that optimize a certain objective function and not to the (arguably more fundamental) task of finding \emph{any} equilibrium. We present an algorithmic model based on the sum-of-squares (SoS) hierarchy that allows escaping this inherent limitation of integrality gaps. In this model, algorithms access the input game only through a relaxed solution to the natural SoS relaxation for computing equilibria. They can then adaptively construct a list of candidate solutions and invoke a verification oracle to check if any candidate on the list is a solution. This model captures most well-studied approximation algorithms such as those for Max-Cut, Sparsest Cut, and Unique Games. The state-of-the-art algorithms for computing exact and approximate equilibria in two-player, $n$-strategy games are captured in this model and require that either i) the size (~ running time) of the SoS relaxation be at least $2^{\Omega(n)}$, or ii) the size of the list of candidates be at least $n^{\Omega(\log n)}$. Our main result shows a lower bound that matches these upper bounds up to constant factors in the exponent. This can be interpreted as an unconditional confirmation, in our restricted algorithmic framework, of Rubinstein's recent conditional hardness result \cite{Rub} for computing approximate equilibria. Our proof strategy involves constructing a family of games that all share a common sum-of-squares solution, but every (approximate) equilibrium of one game is far from every (approximate) equilibrium of any other game in the family.