ترغب بنشر مسار تعليمي؟ اضغط هنا

Combinatorial Group Testing and Sparse Recovery Schemes with Near-Optimal Decoding Time

94   0   0.0 ( 0 )
 نشر من قبل Vasileios Nakos
 تاريخ النشر 2020
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

In the long-studied problem of combinatorial group testing, one is asked to detect a set of $k$ defective items out of a population of size $n$, using $m ll n$ disjunctive measurements. In the non-adaptive setting, the most widely used combinatorial objects are disjunct and list-disjunct matrices, which define incidence matrices of test schemes. Disjunct matrices allow the identification of the exact set of defectives, whereas list disjunct matrices identify a small superset of the defectives. Apart from the combinatorial guarantees, it is often of key interest to equip measurement designs with efficient decoding algorithms. The most efficient decoders should run in sublinear time in $n$, and ideally near-linear in the number of measurements $m$. In this work, we give several constructions with an optimal number of measurements and near-optimal decoding time for the most fundamental group testing tasks, as well as for central tasks in the compressed sensing and heavy hitters literature. For many of those tasks, the previous measurement-optimal constructions needed time either quadratic in the number of measurements or linear in the universe size. Most of our results are obtained via a clean and novel approach which avoids list-recoverable codes or related complex techniques which were present in almost every state-of-the-art work on efficiently decodable constructions of such objects.



قيم البحث

اقرأ أيضاً

Identification of up to $d$ defective items and up to $h$ inhibitors in a set of $n$ items is the main task of non-adaptive group testing with inhibitors. To efficiently reduce the cost of this Herculean task, a subset of the $n$ items is formed and then tested. This is called textit{group testing}. A test outcome on a subset of items is positive if the subset contains at least one defective item and no inhibitors, and negative otherwise. We present two decoding schemes for efficiently identifying the defective items and the inhibitors in the presence of $e$ erroneous outcomes in time $mathsf{poly}(d, h, e, log_2{n})$, which is sublinear to the number of items $n$. This decoding complexity significantly improves the state-of-the-art schemes in which the decoding time is linear to the number of items $n$, i.e., $mathsf{poly}(d, h, e, n)$. Moreover, each column of the measurement matrices associated with the proposed schemes can be nonrandomly generated in polynomial order of the number of rows. As a result, one can save space for storing them. Simulation results confirm our theoretical analysis. When the number of items is sufficiently large, the decoding time in our proposed scheme is smallest in comparison with existing work. In addition, when some erroneous outcomes are allowed, the number of tests in the proposed scheme is often smaller than the number of tests in existing work.
We consider the problem of clustering a graph $G$ into two communities by observing a subset of the vertex correlations. Specifically, we consider the inverse problem with observed variables $Y=B_G x oplus Z$, where $B_G$ is the incidence matrix of a graph $G$, $x$ is the vector of unknown vertex variables (with a uniform prior) and $Z$ is a noise vector with Bernoulli$(varepsilon)$ i.i.d. entries. All variables and operations are Boolean. This model is motivated by coding, synchronization, and community detection problems. In particular, it corresponds to a stochastic block model or a correlation clustering problem with two communities and censored edges. Without noise, exact recovery (up to global flip) of $x$ is possible if and only the graph $G$ is connected, with a sharp threshold at the edge probability $log(n)/n$ for ErdH{o}s-Renyi random graphs. The first goal of this paper is to determine how the edge probability $p$ needs to scale to allow exact recovery in the presence of noise. Defining the degree (oversampling) rate of the graph by $alpha =np/log(n)$, it is shown that exact recovery is possible if and only if $alpha >2/(1-2varepsilon)^2+ o(1/(1-2varepsilon)^2)$. In other words, $2/(1-2varepsilon)^2$ is the information theoretic threshold for exact recovery at low-SNR. In addition, an efficient recovery algorithm based on semidefinite programming is proposed and shown to succeed in the threshold regime up to twice the optimal rate. For a deterministic graph $G$, defining the degree rate as $alpha=d/log(n)$, where $d$ is the minimum degree of the graph, it is shown that the proposed method achieves the rate $alpha> 4((1+lambda)/(1-lambda)^2)/(1-2varepsilon)^2+ o(1/(1-2varepsilon)^2)$, where $1-lambda$ is the spectral gap of the graph $G$.
An $(n, M)$ vector code $mathcal{C} subseteq mathbb{F}^n$ is a collection of $M$ codewords where $n$ elements (from the field $mathbb{F}$) in each of the codewords are referred to as code blocks. Assuming that $mathbb{F} cong mathbb{B}^{ell}$, the co de blocks are treated as $ell$-length vectors over the base field $mathbb{B}$. Equivalently, the code is said to have the sub-packetization level $ell$. This paper addresses the problem of constructing MDS vector codes which enable exact reconstruction of each code block by downloading small amount of information from the remaining code blocks. The repair bandwidth of a code measures the information flow from the remaining code blocks during the reconstruction of a single code block. This problem naturally arises in the context of distributed storage systems as the node repair problem [4]. Assuming that $M = |mathbb{B}|^{kell}$, the repair bandwidth of an MDS vector code is lower bounded by $big(frac{n - 1}{n - k}big)cdot ell$ symbols (over the base field $mathbb{B}$) which is also referred to as the cut-set bound [4]. For all values of $n$ and $k$, the MDS vector codes that attain the cut-set bound with the sub-packetization level $ell = (n-k)^{lceil{{n}/{(n-k)}}rceil}$ are known in the literature [23, 35]. This paper presents a construction for MDS vector codes which simultaneously ensures both small repair bandwidth and small sub-packetization level. The obtained codes have the smallest possible sub-packetization level $ell = O(n - k)$ for an MDS vector code and the repair bandwidth which is at most twice the cut-set bound. The paper then generalizes this code construction so that the repair bandwidth of the obtained codes approach the cut-set bound at the cost of increased sub-packetization level. The constructions presented in this paper give MDS vector codes which are linear over the base field $mathbb{B}$.
The goal of threshold group testing is to identify up to $d$ defective items among a population of $n$ items, where $d$ is usually much smaller than $n$. A test is positive if it has at least $u$ defective items and negative otherwise. Our objective is to identify defective items in sublinear time the number of items, e.g., $mathrm{poly}(d, ln{n}),$ by using the number of tests as low as possible. In this paper, we reduce the number of tests to $O left( h times frac{d^2 ln^2{n}}{mathsf{W}^2(d ln{n})} right)$ and the decoding time to $O left( mathrm{dec}_0 times h right),$ where $mathrm{dec}_0 = O left( frac{d^{3.57} ln^{6.26}{n}}{mathsf{W}^{6.26}(d ln{n})} right) + O left( frac{d^6 ln^4{n}}{mathsf{W}^4(d ln{n})} right)$, $h = Oleft( frac{d_0^2 ln{frac{n}{d_0}}}{(1-p)^2} right)$ , $d_0 = max{u, d - u }$, $p in [0, 1),$ and $mathsf{W}(x) = Theta left( ln{x} - ln{ln{x}} right).$ If the number of tests is increased to $Oleft( h times frac{d^2ln^3{n}}{mathsf{W}^2(d ln{n})} right),$ the decoding complexity is reduced to $O left(mathrm{dec}_1 times h right),$ where $mathrm{dec}_1 = max left{ frac{d^2 ln^3{n}}{mathsf{W}^2(d ln{n})}, frac{ud ln^4{n}}{mathsf{W}^3(d ln{n})} right}.$ Moreover, our proposed scheme is capable of handling errors in test outcomes.
The task of non-adaptive group testing is to identify up to $d$ defective items from $N$ items, where a test is positive if it contains at least one defective item, and negative otherwise. If there are $t$ tests, they can be represented as a $t times N$ measurement matrix. We have answered the question of whether there exists a scheme such that a larger measurement matrix, built from a given $ttimes N$ measurement matrix, can be used to identify up to $d$ defective items in time $O(t log_2{N})$. In the meantime, a $t times N$ nonrandom measurement matrix with $t = O left(frac{d^2 log_2^2{N}}{(log_2(dlog_2{N}) - log_2{log_2(dlog_2{N})})^2} right)$ can be obtained to identify up to $d$ defective items in time $mathrm{poly}(t)$. This is much better than the best well-known bound, $t = O left( d^2 log_2^2{N} right)$. For the special case $d = 2$, there exists an efficient nonrandom construction in which at most two defective items can be identified in time $4log_2^2{N}$ using $t = 4log_2^2{N}$ tests. Numerical results show that our proposed scheme is more practical than existing ones, and experimental results confirm our theoretical analysis. In particular, up to $2^{7} = 128$ defective items can be identified in less than $16$s even for $N = 2^{100}$.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا