Sum-of-Squares Polynomial Flow

127 0 0.0 ( 0 )

Download Cite

Added by Priyank Jaini

Publication date 2019

fields Informatics Engineering Mathematical Statistics

and research's language is English

Authors Priyank Jaini - Kira A. Selby - Yaoliang Yu

Machine Learning Machine Learning

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

Triangular map is a recent construct in probability theory that allows one to transform any source probability density function to any target density function. Based on triangular maps, we propose a general framework for high-dimensional density estimation, by specifying one-dimensional transformations (equivalently conditional densities) and appropriate conditioner networks. This framework (a) reveals the commonalities and differences of existing autoregressive and flow based methods, (b) allows a unified understanding of the limitations and representation power of these recent approaches and, (c) motivates us to uncover a new Sum-of-Squares (SOS) flow that is interpretable, universal, and easy to train. We perform several synthetic experiments on various density geometries to demonstrate the benefits (and short-comings) of such transformations. SOS flows achieve competitive results in simulations and several real-world datasets.

rate research

Exact tensor completion with sum-of-squares

75 - Aaron Potechin , David Steurer 2017

We obtain the first polynomial-time algorithm for exact tensor completion that improves over the bound implied by reduction to matrix completion. The algorithm recovers an unknown 3-tensor with $r$ incoherent, orthogonal components in $mathbb R^n$ from $rcdot tilde O(n^{1.5})$ randomly observed entries of the tensor. This bound improves over the previous best one of $rcdot tilde O(n^{2})$ by reduction to exact matrix completion. Our bound also matches the best known results for the easier problem of approximate tensor completion (Barak & Moitra, 2015). Our algorithm and analysis extends seminal results for exact matrix completion (Candes & Recht, 2009) to the tensor setting via the sum-of-squares method. The main technical challenge is to show that a small number of randomly chosen monomials are enough to construct a degree-3 polynomial with precisely planted orthogonal global optima over the sphere and that this fact can be certified within the sum-of-squares proof system.

Machine Learning Computational Complexity Data Structures and Algorithms

Tensor principal component analysis via sum-of-squares proofs

316 - Samuel B. Hopkins , Jonathan Shi , David Steurer 2015

We study a statistical model for the tensor principal component analysis problem introduced by Montanari and Richard: Given a order-$3$ tensor $T$ of the form $T = tau cdot v_0^{otimes 3} + A$, where $tau geq 0$ is a signal-to-noise ratio, $v_0$ is a unit vector, and $A$ is a random noise tensor, the goal is to recover the planted vector $v_0$. For the case that $A$ has iid standard Gaussian entries, we give an efficient algorithm to recover $v_0$ whenever $tau geq omega(n^{3/4} log(n)^{1/4})$, and certify that the recovered vector is close to a maximum likelihood estimator, all with high probability over the random choice of $A$. The previous best algorithms with provable guarantees required $tau geq Omega(n)$. In the regime $tau leq o(n)$, natural tensor-unfolding-based spectral relaxations for the underlying optimization problem break down (in the sense that their integrality gap is large). To go beyond this barrier, we use convex relaxations based on the sum-of-squares method. Our recovery algorithm proceeds by rounding a degree-$4$ sum-of-squares relaxations of the maximum-likelihood-estimation problem for the statistical model. To complement our algorithmic results, we show that degree-$4$ sum-of-squares relaxations break down for $tau leq O(n^{3/4}/log(n)^{1/4})$, which demonstrates that improving our current guarantees (by more than logarithmic factors) would require new techniques or might even be intractable. Finally, we show how to exploit additional problem structure in order to solve our sum-of-squares relaxations, up to some approximation, very efficiently. Our fastest algorithm runs in nearly-linear time using shifted (matrix) power iteration and has similar guarantees as above. The analysis of this algorithm also confirms a variant of a conjecture of Montanari and Richard about singular vectors of tensor unfoldings.

Machine Learning Computational Complexity Data Structures and Algorithms

Noisy Tensor Completion via the Sum-of-Squares Hierarchy

305 - Boaz Barak , Ankur Moitra 2015

In the noisy tensor completion problem we observe $m$ entries (whose location is chosen uniformly at random) from an unknown $n_1 times n_2 times n_3$ tensor $T$. We assume that $T$ is entry-wise close to being rank $r$. Our goal is to fill in its missing entries using as few observations as possible. Let $n = max(n_1, n_2, n_3)$. We show that if $m = n^{3/2} r$ then there is a polynomial time algorithm based on the sixth level of the sum-of-squares hierarchy for completing it. Our estimate agrees with almost all of $T$s entries almost exactly and works even when our observations are corrupted by noise. This is also the first algorithm for tensor completion that works in the overcomplete case when $r > n$, and in fact it works all the way up to $r = n^{3/2-epsilon}$. Our proofs are short and simple and are based on establishing a new connection between noisy tensor completion (through the language of Rademacher complexity) and the task of refuting random constant satisfaction problems. This connection seems to have gone unnoticed even in the context of matrix completion. Furthermore, we use this connection to show matching lower bounds. Our main technical result is in characterizing the Rademacher complexity of the sequence of norms that arise in the sum-of-squares relaxations to the tensor nuclear norm. These results point to an interesting new direction: Can we explore computational vs. sample complexity tradeoffs through the sum-of-squares hierarchy?

Machine Learning Data Structures and Algorithms Machine Learning

Rounding Sum-of-Squares Relaxations

141 - Boaz Barak , Jonathan Kelner , David Steurer 2013

We present a general approach to rounding semidefinite programming relaxations obtained by the Sum-of-Squares method (Lasserre hierarchy). Our approach is based on using the connection between these relaxations and the Sum-of-Squares proof system to transform a *combining algorithm* -- an algorithm that maps a distribution over solutions into a (possibly weaker) solution -- into a *rounding algorithm* that maps a solution of the relaxation to a solution of the original problem. Using this approach, we obtain algorithms that yield improved results for natural variants of three well-known problems: 1) We give a quasipolynomial-time algorithm that approximates the maximum of a low degree multivariate polynomial with non-negative coefficients over the Euclidean unit sphere. Beyond being of interest in its own right, this is related to an open question in quantum information theory, and our techniques have already led to improved results in this area (Brand~{a}o and Harrow, STOC 13). 2) We give a polynomial-time algorithm that, given a d dimensional subspace of R^n that (almost) contains the characteristic function of a set of size n/k, finds a vector $v$ in the subspace satisfying $|v|_4^4 > c(k/d^{1/3}) |v|_2^2$, where $|v|_p = (E_i v_i^p)^{1/p}$. Aside from being a natural relaxation, this is also motivated by a connection to the Small Set Expansion problem shown by Barak et al. (STOC 2012) and our results yield a certain improvement for that problem. 3) We use this notion of L_4 vs. L_2 sparsity to obtain a polynomial-time algorithm with substantially improved guarantees for recovering a planted $mu$-sparse vector v in a random d-dimensional subspace of R^n. If v has mu n nonzero coordinates, we can recover it with high probability whenever $mu < O(min(1,n/d^2))$, improving for $d < n^{2/3}$ prior methods which intrinsically required $mu < O(1/sqrt(d))$.

Data Structures and Algorithms Machine Learning Quantum Physics

Association schemes, non-commutative polynomial concentration, and sum-of-squares lower bounds for planted clique

226 - Raghu Meka , Avi Wigderson 2013

Finding cliques in random graphs and the closely related planted clique variant, where a clique of size t is planted in a random G(n,1/2) graph, have been the focus of substantial study in algorithm design. Despite much effort, the best known polynomial-time algorithms only solve the problem for t = Theta(sqrt(n)). Here we show that beating sqrt(n) would require substantially new algorithmic ideas, by proving a lower bound for the problem in the sum-of-squares (or Lasserre) hierarchy, the most powerful class of semi-definite programming algorithms we know of: r rounds of the sum-of-squares hierarchy can only solve the planted clique for t > sqrt(n)/(C log n)^(r^2). Previously, no nontrivial lower bounds were known. Our proof is formulated as a degree lower bound in the Positivstellensatz algebraic proof system, which is equivalent to the sum-of-squares hierarchy. The heart of our (average-case) lower bound is a proof that a certain random matrix derived from the input graph is (with high probability) positive semidefinite. Two ingredients play an important role in this proof. The first is the classical theory of association schemes, applied to the average and variance of that random matrix. The second is a new large deviation inequality for matrix-valued polynomials. Our new tail estimate seems to be of independent interest and may find other applications, as it generalizes both the estimates on real-valued polynomials and on sums of independent random matrices.

Computational Complexity Data Structures and Algorithms Combinatorics

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Sum-of-Squares Polynomial Flow

Ask ChatGPT about the research

No Arabic abstract

Read More

suggested questions