No Arabic abstract
Two polynomials $f, g in mathbb{F}[x_1, ldots, x_n]$ are called shift-equivalent if there exists a vector $(a_1, ldots, a_n) in mathbb{F}^n$ such that the polynomial identity $f(x_1+a_1, ldots, x_n+a_n) equiv g(x_1,ldots,x_n)$ holds. Our main result is a new randomized algorithm that tests whether two given polynomials are shift equivalent. Our algorithm runs in time polynomial in the circuit size of the polynomials, to which it is given black box access. This complements a previous work of Grigoriev (Theoretical Computer Science, 1997) who gave a deterministic algorithm running in time $n^{O(d)}$ for degree $d$ polynomials. Our algorithm uses randomness only to solve instances of the Polynomial Identity Testing (PIT) problem. Hence, if one could de-randomize PIT (a long-standing open problem in complexity) a de-randomization of our algorithm would follow. This establishes an equivalence between de-randomizing shift-equivalence testing and de-randomizing PIT (both in the black-box and the white-box setting). For certain restricted models, such as Read Once Branching Programs, we already obtain a deterministic algorithm using existing PIT results.
In this work, we introduce statistical testing under distributional shifts. We are interested in the hypothesis $P^* in H_0$ for a target distribution $P^*$, but observe data from a different distribution $Q^*$. We assume that $P^*$ is related to $Q^*$ through a known shift $tau$ and formally introduce hypothesis testing in this setting. We propose a general testing procedure that first resamples from the observed data to construct an auxiliary data set and then applies an existing test in the target domain. We prove that if the size of the resample is at most $o(sqrt{n})$ and the resampling weights are well-behaved, this procedure inherits the pointwise asymptotic level and power from the target test. If the map $tau$ is estimated from data, we can maintain the above guarantees under mild conditions if the estimation works sufficiently well. We further extend our results to uniform asymptotic level and a different resampling scheme. Testing under distributional shifts allows us to tackle a diverse set of problems. We argue that it may prove useful in reinforcement learning and covariate shift, we show how it reduces conditional to unconditional independence testing and we provide example applications in causal inference.
The well-known DeMillo-Lipton-Schwartz-Zippel lemma says that $n$-variate polynomials of total degree at most $d$ over grids, i.e. sets of the form $A_1 times A_2 times cdots times A_n$, form error-correcting codes (of distance at least $2^{-d}$ provided $min_i{|A_i|}geq 2$). In this work we explore their local decodability and (tolerant) local testability. While these aspects have been studied extensively when $A_1 = cdots = A_n = mathbb{F}_q$ are the same finite field, the setting when $A_i$s are not the full field does not seem to have been explored before. In this work we focus on the case $A_i = {0,1}$ for every $i$. We show that for every field (finite or otherwise) there is a test whose query complexity depends only on the degree (and not on the number of variables). In contrast we show that decodability is possible over fields of positive characteristic (with query complexity growing with the degree of the polynomial and the characteristic), but not over the reals, where the query complexity must grow with $n$. As a consequence we get a natural example of a code (one with a transitive group of symmetries) that is locally testable but not locally decodable. Classical results on local decoding and testing of polynomials have relied on the 2-transitive symmetries of the space of low-degree polynomials (under affine transformations). Grids do not possess this symmetry: So we introduce some new techniques to overcome this handicap and in particular use the hypercontractivity of the (constant weight) noise operator on the Hamming cube.
This paper deals with the partition function of the Ising model from statistical mechanics, which is used to study phase transitions in physical systems. A special case of interest is that of the Ising model with constant energies and external field. One may consider such an Ising system as a simple graph together with vertex and edge weights. When these weights are considered indeterminates, the partition function for the constant case is a trivariate polynomial Z(G;x,y,z). This polynomial was studied with respect to its approximability by L. A. Goldberg, M. Jerrum and M. Paterson in 2003. Z(G;x,y,z) generalizes a bivariate polynomial Z(G;t,y), which was studied by D. Andr{e}n and K. Markstr{o}m in 2009. We consider the complexity of Z(G;t,y) and Z(G;x,y,z) in comparison to that of the Tutte polynomial, which is well-known to be closely related to the Potts model in the absence of an external field. We show that Z(G;x,y,z) is #P-hard to evaluate at all points in $mathbb{Q}^3$, except those in an exception set of low dimension, even when restricted to simple graphs which are bipartite and planar. A counting version of the Exponential Time Hypothesis, #ETH, was introduced by H. Dell, T. Husfeldt and M. Wahl{e}n in 2010 in order to study the complexity of the Tutte polynomial. In analogy to their results, we give a dichotomy theorem stating that evaluations of Z(G;t,y) either take exponential time in the number of vertices of $G$ to compute, or can be done in polynomial time. Finally, we give an algorithm for computing Z(G;x,y,z) in polynomial time on graphs of bounded clique-width, which is not known in the case of the Tutte polynomial.
We give a new framework for proving the existence of low-degree, polynomial approximators for Boolean functions with respect to broad classes of non-product distributions. Our proofs use techniques related to the classical moment problem and deviate significantly from known Fourier-based methods, which require the underlying distribution to have some product structure. Our main application is the first polynomial-time algorithm for agnostically learning any function of a constant number of halfspaces with respect to any log-concave distribution (for any constant accuracy parameter). This result was not known even for the case of learning the intersection of two halfspaces without noise. Additionally, we show that in the smoothed-analysis setting, the above results hold with respect to distributions that have sub-exponential tails, a property satisfied by many natural and well-studied distributions in machine learning. Given that our algorithms can be implemented using Support Vector Machines (SVMs) with a polynomial kernel, these results give a rigorous theoretical explanation as to why many kernel methods work so well in practice.
We investigate bisimulation equivalence on Petri nets under durational semantics. Our motivation was to verify the conjecture that in durational setting, the bisimulation equivalence checking problem becomes more tractable than in ordinary setting (which is the case, e.g., over communication-free nets). We disprove this conjecture in three of four proposed variants of durational semantics. The fourth variant remains an intriguing open problem.