No Arabic abstract
The goal of combinatorial group testing is to efficiently identify up to $d$ defective items in a large population of $n$ items, where $d ll n$. Defective items satisfy certain properties while the remaining items in the population do not. To efficiently identify defective items, a subset of items is pooled and then tested. In this work, we consider complex group testing (CmplxGT) in which a set of defective items consists of subsets of positive items (called textit{positive complexes}). CmplxGT is classified into two categories: classical CmplxGT (CCmplxGT) and generalized CmplxGT (GCmplxGT). In CCmplxGT, the outcome of a test on a subset of items is positive if the subset contains at least one positive complex, and negative otherwise. In GCmplxGT, the outcome of a test on a subset of items is positive if the subset has a certain number of items of some positive complex, and negative otherwise. For CCmplxGT, we present a scheme that efficiently identifies all positive complexes in time $t times mathrm{poly}(d, ln{n})$ in the presence of erroneous outcomes, where $t$ is a predefined parameter. As $d ll n$, this is significantly better than the currently best time of $mathrm{poly}(t) times O(n ln{n})$. Moreover, in specific cases, the number of tests in our proposed scheme is smaller than previous work. For GCmplxGT, we present a scheme that efficiently identifies all positive complexes. These schemes are directly applicable in various areas such as complex disease genetics, molecular biology, and learning a hidden graph.
The task of non-adaptive group testing is to identify up to $d$ defective items from $N$ items, where a test is positive if it contains at least one defective item, and negative otherwise. If there are $t$ tests, they can be represented as a $t times N$ measurement matrix. We have answered the question of whether there exists a scheme such that a larger measurement matrix, built from a given $ttimes N$ measurement matrix, can be used to identify up to $d$ defective items in time $O(t log_2{N})$. In the meantime, a $t times N$ nonrandom measurement matrix with $t = O left(frac{d^2 log_2^2{N}}{(log_2(dlog_2{N}) - log_2{log_2(dlog_2{N})})^2} right)$ can be obtained to identify up to $d$ defective items in time $mathrm{poly}(t)$. This is much better than the best well-known bound, $t = O left( d^2 log_2^2{N} right)$. For the special case $d = 2$, there exists an efficient nonrandom construction in which at most two defective items can be identified in time $4log_2^2{N}$ using $t = 4log_2^2{N}$ tests. Numerical results show that our proposed scheme is more practical than existing ones, and experimental results confirm our theoretical analysis. In particular, up to $2^{7} = 128$ defective items can be identified in less than $16$s even for $N = 2^{100}$.
Identification of up to $d$ defective items and up to $h$ inhibitors in a set of $n$ items is the main task of non-adaptive group testing with inhibitors. To efficiently reduce the cost of this Herculean task, a subset of the $n$ items is formed and then tested. This is called textit{group testing}. A test outcome on a subset of items is positive if the subset contains at least one defective item and no inhibitors, and negative otherwise. We present two decoding schemes for efficiently identifying the defective items and the inhibitors in the presence of $e$ erroneous outcomes in time $mathsf{poly}(d, h, e, log_2{n})$, which is sublinear to the number of items $n$. This decoding complexity significantly improves the state-of-the-art schemes in which the decoding time is linear to the number of items $n$, i.e., $mathsf{poly}(d, h, e, n)$. Moreover, each column of the measurement matrices associated with the proposed schemes can be nonrandomly generated in polynomial order of the number of rows. As a result, one can save space for storing them. Simulation results confirm our theoretical analysis. When the number of items is sufficiently large, the decoding time in our proposed scheme is smallest in comparison with existing work. In addition, when some erroneous outcomes are allowed, the number of tests in the proposed scheme is often smaller than the number of tests in existing work.
The goal of non-adaptive group testing is to identify at most $d$ defective items from $N$ items, in which a test of a subset of $N$ items is positive if it contains at least one defective item, and negative otherwise. However, in many cases, especially in biological screening, the outcome is unreliable due to biochemical interaction; i.e., textit{noise.} Consequently, a positive result can change to a negative one (false negative) and vice versa (false positive). In this work, we first consider the dilution effect in which textit{the degree of noise depends on the number of items in the test}. Two efficient schemes are presented for identifying the defective items in time linearly to the number of tests needed. Experimental results validate our theoretical analysis. Specifically, setting the error precision of 0.001 and $dleq16$, our proposed algorithms always identify all defective items in less than 7 seconds for $N=2^{33}approx 9$ billion.
The goal of threshold group testing is to identify up to $d$ defective items among a population of $n$ items, where $d$ is usually much smaller than $n$. A test is positive if it has at least $u$ defective items and negative otherwise. Our objective is to identify defective items in sublinear time the number of items, e.g., $mathrm{poly}(d, ln{n}),$ by using the number of tests as low as possible. In this paper, we reduce the number of tests to $O left( h times frac{d^2 ln^2{n}}{mathsf{W}^2(d ln{n})} right)$ and the decoding time to $O left( mathrm{dec}_0 times h right),$ where $mathrm{dec}_0 = O left( frac{d^{3.57} ln^{6.26}{n}}{mathsf{W}^{6.26}(d ln{n})} right) + O left( frac{d^6 ln^4{n}}{mathsf{W}^4(d ln{n})} right)$, $h = Oleft( frac{d_0^2 ln{frac{n}{d_0}}}{(1-p)^2} right)$ , $d_0 = max{u, d - u }$, $p in [0, 1),$ and $mathsf{W}(x) = Theta left( ln{x} - ln{ln{x}} right).$ If the number of tests is increased to $Oleft( h times frac{d^2ln^3{n}}{mathsf{W}^2(d ln{n})} right),$ the decoding complexity is reduced to $O left(mathrm{dec}_1 times h right),$ where $mathrm{dec}_1 = max left{ frac{d^2 ln^3{n}}{mathsf{W}^2(d ln{n})}, frac{ud ln^4{n}}{mathsf{W}^3(d ln{n})} right}.$ Moreover, our proposed scheme is capable of handling errors in test outcomes.
The main goal of group testing with inhibitors (GTI) is to efficiently identify a small number of defective items and inhibitor items in a large set of items. A test on a subset of items is positive if the subset satisfies some specific properties. Inhibitor items cancel the effects of defective items, which often make the outcome of a test containing defective items negative. Different GTI models can be formulated by considering how specific properties have different cancellation effects. This work introduces generalized GTI (GGTI) in which a new type of items is added, i.e., hybrid items. A hybrid item plays the roles of both defectives items and inhibitor items. Since the number of instances of GGTI is large (more than 7 million), we introduce a framework for classifying all types of items non-adaptively, i.e., all tests are designed in advance. We then explain how GGTI can be used to classify neurons in neuroscience. Finally, we show how to realize our proposed scheme in practice.