In the group testing problem the aim is to identify a small set of $k \sim n^\theta$ infected individuals out of a population of size $n$, where $0<\theta<1$. We avail ourselves of a test procedure capable of testing groups of individuals, with the test returning a positive result iff at least one individual in the group is infected. The aim is to devise a test design with as few tests as possible so that the set of infected individuals can be identified correctly with high probability. We establish an explicit sharp information-theoretic/algorithmic phase transition $m_{\mathrm{inf}}$ for non-adaptive group testing, where all tests are conducted in parallel. Thus, with more than $m_{\mathrm{inf}}$ tests the infected individuals can be identified in polynomial time with high probability, while learning the set of infected individuals is information-theoretically impossible with fewer tests. In addition, we develop an optimal adaptive scheme where the tests are conducted in two stages.
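To make the OR-test model concrete, the following Python sketch simulates pooled tests on a population with $k \sim n^\theta$ infected individuals and runs a generic two-stage procedure (random pools first, then individual confirmation of the survivors). The pool sizes, the number of first-stage tests, and the two-stage rule are illustrative assumptions for the sketch, not the optimal non-adaptive design or the adaptive scheme of the abstract.

```python
import numpy as np

# Minimal sketch of the OR-test model and a generic two-stage procedure.
# All parameter choices below are ad hoc, made only to illustrate the mechanics.

rng = np.random.default_rng(0)

n, theta = 10_000, 0.3
k = int(round(n ** theta))                       # k ~ n^theta infected individuals
infected = np.zeros(n, dtype=bool)
infected[rng.choice(n, size=k, replace=False)] = True

def pooled_test(group):
    """OR test: positive iff at least one member of the group is infected."""
    return bool(infected[group].any())

# Stage 1 (non-adaptive): random pools of size about n/(2k).
pool_size = max(1, n // (2 * k))
num_pools = 6 * k                                # ad-hoc choice for this sketch
pools = [rng.choice(n, size=pool_size, replace=False) for _ in range(num_pools)]
results = [pooled_test(g) for g in pools]

# Anyone appearing in a negative pool is certainly healthy.
candidates = np.ones(n, dtype=bool)
for g, positive in zip(pools, results):
    if not positive:
        candidates[g] = False

# Stage 2 (adaptive): test the surviving candidates individually.
estimate = np.zeros(n, dtype=bool)
for i in np.flatnonzero(candidates):
    estimate[i] = pooled_test([i])

print("recovered exactly:", bool(np.array_equal(estimate, infected)),
      "| tests used:", num_pools + int(candidates.sum()))
```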
In the group testing problem we aim to identify a small number of infected individuals within a large population. We avail ourselves of a procedure that can test a group of multiple individuals, with the test result coming out positive iff at least one individual in the group is infected. With all tests conducted in parallel, what is the least number of tests required to identify the status of all individuals? In a recent test design [Aldridge et al. 2016] the individuals are assigned to test groups randomly, with every individual joining an equal number of groups. We pinpoint the sharp threshold for the number of tests required in this randomised design so that it is information-theoretically possible to infer the infection status of every individual. Moreover, we analyse two efficient inference algorithms. These results settle conjectures from [Aldridge et al. 2014, Johnson et al. 2019].
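Below is a minimal sketch of the randomised design just described, in which every individual joins the same number of test groups, together with a simple decoder that declares healthy anyone appearing in a negative test (the standard COMP-style rule). The parameter values are ad-hoc assumptions; the sharp threshold and the two inference algorithms analysed in the abstract are not reproduced here.

```python
import numpy as np

# Sketch of the constant-tests-per-individual random design: m tests, each
# individual placed in `reps` of them, a test being positive iff it contains
# at least one infected individual.  Decoding: anyone in a negative test is
# declared healthy, everyone else infected.  Parameters are illustrative.

rng = np.random.default_rng(1)

n, k, m, reps = 5_000, 50, 1_200, 10
infected = np.zeros(n, dtype=bool)
infected[rng.choice(n, size=k, replace=False)] = True

# Constant column weight: individual i joins `reps` tests chosen uniformly.
membership = np.zeros((m, n), dtype=bool)
for i in range(n):
    membership[rng.choice(m, size=reps, replace=False), i] = True

positive = (membership.astype(int) @ infected.astype(int)) > 0   # OR of statuses

appears_in_negative = membership[~positive].any(axis=0)
estimate = ~appears_in_negative

print("false positives:", int((estimate & ~infected).sum()),
      "false negatives:", int((~estimate & infected).sum()))
```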
A function $f\colon \{0,1\}^n \to \{0,1\}$ is called an approximate AND-homomorphism if, choosing $\mathbf{x},\mathbf{y}\in\{0,1\}^n$ randomly, we have that $f(\mathbf{x}\land \mathbf{y}) = f(\mathbf{x})\land f(\mathbf{y})$ with probability at least $1-\epsilon$, where $x\land y = (x_1\land y_1,\ldots,x_n\land y_n)$. We prove that if $f\colon \{0,1\}^n \to \{0,1\}$ is an approximate AND-homomorphism, then $f$ is $\delta$-close to either a constant function or an AND function, where $\delta(\epsilon) \to 0$ as $\epsilon\to 0$. This improves on a result of Nehama, who proved a similar statement in which $\delta$ depends on $n$. Our theorem implies a strong result on judgement aggregation in computational social choice. In the language of social choice, our result shows that if $f$ is $\epsilon$-close to satisfying judgement aggregation, then it is $\delta(\epsilon)$-close to an oligarchy (the name for the AND function in social choice theory). This improves on Nehama's result, in which $\delta$ decays polynomially with $n$. Our result follows from a more general one, in which we characterize approximate solutions to the eigenvalue equation $\mathrm{T} f = \lambda g$, where $\mathrm{T}$ is the downwards noise operator $\mathrm{T} f(x) = \mathbb{E}_{\mathbf{y}}[f(x \land \mathbf{y})]$, $f$ is $[0,1]$-valued, and $g$ is $\{0,1\}$-valued. We identify all exact solutions to this equation, and show that any approximate solution in which $\mathrm{T} f$ and $\lambda g$ are close is close to an exact solution.
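The defining condition is easy to probe empirically: sample random pairs $\mathbf{x},\mathbf{y}$ and measure how often $f(\mathbf{x}\land\mathbf{y}) \ne f(\mathbf{x})\land f(\mathbf{y})$. The sketch below does this for two illustrative functions, an AND over a fixed coordinate set (an oligarchy, which violates the condition with probability zero) and majority; the choice of functions and parameters is an assumption made for the example only.

```python
import numpy as np

# Empirical check of the approximate AND-homomorphism condition
#     f(x AND y) = f(x) AND f(y)   with probability >= 1 - epsilon
# over uniformly random x, y in {0,1}^n.

rng = np.random.default_rng(2)
n, trials = 20, 200_000

def empirical_violation(f):
    x = rng.integers(0, 2, size=(trials, n), dtype=np.uint8)
    y = rng.integers(0, 2, size=(trials, n), dtype=np.uint8)
    return float(np.mean(f(x & y) != (f(x) & f(y))))

oligarchy = lambda z: z[:, :3].min(axis=1).astype(np.uint8)        # AND of coords 0,1,2
majority  = lambda z: (z.sum(axis=1) > n // 2).astype(np.uint8)

print("AND (oligarchy) violation rate:", empirical_violation(oligarchy))  # exactly 0
print("majority        violation rate:", empirical_violation(majority))   # bounded away from 0
```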
A lattice is a partially ordered set in which every pair of elements has a unique least upper bound and a unique greatest lower bound within the set. Kötter and Kschischang proved that codes in the linear lattice can be used for error and erasure correction in random networks. Codes in the linear lattice have previously been shown to be special cases of codes in modular lattices. Two well-known classes of modular lattices are geometric and distributive lattices. We identify the unique criterion which makes a geometric lattice distributive, thus characterizing all finite geometric distributive lattices. Our characterization helps to prove a conjecture regarding the maximum size of a distributive sublattice of a finite geometric lattice and to identify the maximal case. The Whitney numbers of the class of geometric distributive lattices are also calculated. We present a few other applications of this unique characterization to derive certain results regarding linearity and complements in the linear lattice.
We study the problem of testing identity against a given distribution with a focus on the high confidence regime. More precisely, given samples from an unknown distribution $p$ over $n$ elements, an explicitly given distribution $q$, and parameters $0< \varepsilon, \delta < 1$, we wish to distinguish, \emph{with probability at least $1-\delta$}, whether the distributions are identical versus $\varepsilon$-far in total variation distance. Most prior work focused on the case that $\delta = \Omega(1)$, for which the sample complexity of identity testing is known to be $\Theta(\sqrt{n}/\varepsilon^2)$. Given such an algorithm, one can achieve arbitrarily small values of $\delta$ via black-box amplification, which multiplies the required number of samples by $\Theta(\log(1/\delta))$. We show that black-box amplification is suboptimal for any $\delta = o(1)$, and give a new identity tester that achieves the optimal sample complexity. Our new upper and lower bounds show that the optimal sample complexity of identity testing is \[ \Theta\left( \frac{1}{\varepsilon^2}\left(\sqrt{n \log(1/\delta)} + \log(1/\delta) \right)\right) \] for any $n$, $\varepsilon$, and $\delta$. For the special case of uniformity testing, where the given distribution is the uniform distribution $U_n$ over the domain, our new tester is surprisingly simple: to test whether $p = U_n$ versus $d_{\mathrm{TV}}(p, U_n) \geq \varepsilon$, we simply threshold $d_{\mathrm{TV}}(\widehat{p}, U_n)$, where $\widehat{p}$ is the empirical probability distribution. The fact that this simple plug-in estimator is sample-optimal is surprising, even in the constant $\delta$ case. Indeed, it was believed that such a tester would not attain sublinear sample complexity even for constant values of $\varepsilon$ and $\delta$.
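The uniformity tester described above is simple enough to state in a few lines: form the empirical distribution $\widehat{p}$ from the samples and threshold $d_{\mathrm{TV}}(\widehat{p}, U_n)$. In the sketch below the sample size and the threshold are ad-hoc assumptions made for illustration; the abstract's result prescribes how they must scale with $n$, $\varepsilon$ and $\delta$.

```python
import numpy as np

# Plug-in uniformity tester: accept "p = U_n" iff d_TV(p_hat, U_n) <= tau.
# Sample size m and threshold tau below are ad-hoc choices for this sketch.

rng = np.random.default_rng(3)

def tv_to_uniform(samples, n):
    """Total variation distance between the empirical distribution and U_n."""
    counts = np.bincount(samples, minlength=n)
    p_hat = counts / counts.sum()
    return 0.5 * np.abs(p_hat - 1.0 / n).sum()

def uniformity_test(samples, n, tau):
    return tv_to_uniform(samples, n) <= tau

n, m, tau = 1_000, 5_000, 0.3

uniform_samples = rng.integers(0, n, size=m)
# A far-from-uniform distribution: all mass on the first half of the domain.
skewed_samples = rng.integers(0, n // 2, size=m)

print("uniform input accepted:", uniformity_test(uniform_samples, n, tau))
print("skewed  input accepted:", uniformity_test(skewed_samples, n, tau))
```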
We consider sequential hypothesis testing between two quantum states using adaptive and non-adaptive strategies. In this setting, samples of an unknown state are requested sequentially, and after each test a decision is made either to continue or to accept one of the two hypotheses. Under the constraint that the number of samples is bounded, either in expectation or with high probability, we exhibit adaptive strategies that minimize both types of misidentification errors. Namely, we show that these errors decrease exponentially (in the stopping time) with decay rates given by the measured relative entropies between the two states. Moreover, if we allow joint measurements on multiple samples, the rates are increased to the respective quantum relative entropies. We also fully characterize the achievable error exponents for non-adaptive strategies, and provide numerical evidence showing that adaptive measurements are necessary to achieve our bounds under some additional assumptions.
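For the joint-measurement regime, the decay rate is the quantum relative entropy $D(\rho\|\sigma) = \mathrm{Tr}[\rho(\log\rho - \log\sigma)]$. The sketch below simply evaluates this quantity for two arbitrary full-rank qubit states chosen for illustration; the measured relative entropy governing the adaptive single-copy rates involves an optimisation over measurements and is not computed here.

```python
import numpy as np

# Quantum relative entropy D(rho||sigma) = Tr[rho (log rho - log sigma)],
# the joint-measurement decay rate mentioned above.  The two qubit states
# below are arbitrary full-rank examples; sigma must have full support for
# D to be finite.

def logm_hermitian(a):
    """Matrix logarithm of a positive-definite Hermitian matrix via eigendecomposition."""
    w, v = np.linalg.eigh(a)
    return (v * np.log(w)) @ v.conj().T

def quantum_relative_entropy(rho, sigma):
    """D(rho||sigma) in nats, assuming both states are full rank."""
    return float(np.real(np.trace(rho @ (logm_hermitian(rho) - logm_hermitian(sigma)))))

rho   = np.array([[0.75, 0.10],
                  [0.10, 0.25]])
sigma = np.eye(2) / 2                            # maximally mixed qubit state

print("D(rho||sigma) =", quantum_relative_entropy(rho, sigma), "nats")
```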