No Arabic abstract
Sequencing a DNA strand, as part of the read process in DNA storage, produces multiple noisy copies which can be combined to produce better estimates of the original strand; this is called trace reconstruction. One can reduce the error rate further by introducing redundancy in the write sequence and this is called coded trace reconstruction. In this paper, we model the DNA storage channel as an insertion-deletion-substitution (IDS) channel and design both encoding schemes and low-complexity decoding algorithms for coded trace reconstruction. We introduce Trellis BMA, a new reconstruction algorithm whose complexity is linear in the number of traces, and compare its performance to previous algorithms. Our results show that it reduces the error rate on both simulated and experimental data. The performance comparisons in this paper are based on a new dataset of traces that will be publicly released with the paper. Our hope is that this dataset will enable research progress by allowing objective comparisons between candidate algorithms.
We propose an enhanced version of trellis coded multiple access (TCMA), an overloaded multiple access scheme that outperforms the original TCMA in terms of achieved spectral efficiency. Enhanced TCMA (ETCMA) performs simultaneous transmission of multiple data streams intended for users experiencing similar signal-to-noise ratios and can be employed both in the uplink and in the downlink of wireless systems, thus overcoming one of the main limitations of TCMA. Thanks to a new receiver algorithm, ETCMA is capable of delivering a significantly higher spectral efficiency. We show that ETCMA approaches the capacity of the Additive White Gaussian Noise channel for a wide range of signal-to-noise ratios.
In this paper, code pairs based on trellis coded modulation are proposed over PSK signal sets for a two-user Gaussian multiple access channel. In order to provide unique decodability property to the receiver and to maximally enlarge the constellation constrained (CC) capacity region, a relative angle of rotation is introduced between the signal sets. Subsequently, the structure of the textit{sum alphabet} of two PSK signal sets is exploited to prove that Ungerboeck labelling on the trellis of each user maximizes the guaranteed minimum squared Euclidean distance, $d^{2}_{g, min}$ in the textit{sum trellis}. Hence, such a labelling scheme can be used systematically to construct trellis code pairs for a two-user GMAC to approach emph{any rate pair} within the capacity region.
Sparse intersymbol-interference (ISI) channels are encountered in a variety of high-data-rate communication systems. Such channels have a large channel memory length, but only a small number of significant channel coefficients. In this paper, trellis-based equalization of sparse ISI channels is revisited. Due to the large channel memory length, the complexity of maximum-likelihood detection, e.g., by means of the Viterbi algorithm (VA), is normally prohibitive. In the first part of the paper, a unified framework based on factor graphs is presented for complexity reduction without loss of optimality. In this new context, two known reduced-complexity algorithms for sparse ISI channels are recapitulated: The multi-trellis VA (M-VA) and the parallel-trellis VA (P-VA). It is shown that the M-VA, although claimed, does not lead to a reduced computational complexity. The P-VA, on the other hand, leads to a significant complexity reduction, but can only be applied for a certain class of sparse channels. In the second part of the paper, a unified approach is investigated to tackle general sparse channels: It is shown that the use of a linear filter at the receiver renders the application of standard reduced-state trellis-based equalizer algorithms feasible, without significant loss of optimality. Numerical results verify the efficiency of the proposed receiver structure.
Let G be a finite strongly connected aperiodic directed graph in which each edge carries a label from a finite alphabet A. Then G induces a trellis coded quantizer for encoding an alphabet A memoryless source. A source sequence of long finite length is encoded by finding a path in G of that length whose sequence of labels is closest in Hamming distance to the source sequence; finding the minimum distance path is a dynamic programming problem that is solved using the Viterbi algorithm. We show how a Markov chain can be used to obtain a closed form expression for the asymptotic expected Hamming distortion per sample that results as the number of encoded source samples increases without bound.
In the emerging field of DNA storage, data is encoded as DNA sequences and stored. The data is read out again by sequencing the stored DNA. Nanopore sequencing is a new sequencing technology that has many advantages over other methods; in particular, it is cheap, portable, and can support longer reads. While several practical coding schemes have been developed for DNA storage with nanopore sequencing, the theory is not well understood. Towards that end, we study a highly abstracted (deterministic) version of the nanopore sequencer, which highlights key features that make its analysis difficult. We develop methods and theory to understand the capacity of our abstracted model, and we propose efficient coding schemes and algorithms.