ترغب بنشر مسار تعليمي؟ اضغط هنا

Sparse Ternary Codes for similarity search have higher coding gain than dense binary codes

85   0   0.0 ( 0 )
 نشر من قبل Sohrab Ferdowsi
 تاريخ النشر 2017
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

This paper addresses the problem of Approximate Nearest Neighbor (ANN) search in pattern recognition where feature vectors in a database are encoded as compact codes in order to speed-up the similarity search in large-scale databases. Considering the ANN problem from an information-theoretic perspective, we interpret it as an encoding, which maps the original feature vectors to a less entropic sparse representation while requiring them to be as informative as possible. We then define the coding gain for ANN search using information-theoretic measures. We next show that the classical approach to this problem, which consists of binarization of the projected vectors is sub-optimal. Instead, a properly designed ternary encoding achieves higher coding gains and lower complexity.



قيم البحث

اقرأ أيضاً

One of the most important and challenging problems in coding theory is to construct codes with best possible parameters and properties. The class of quasi-cyclic (QC) codes is known to be fertile to produce such codes. Focusing on QC codes over the b inary field, we have found 113 binary QC codes that are new among the class of QC codes using an implementation of a fast cyclic partitioning algorithm and the highly effective ASR algorithm. Moreover, these codes have the following additional properties: a) they have the same parameters as best known linear codes, and b) many of the have additional desired properties such as being reversible, LCD, self-orthogonal or dual-containing. Additionally, we present an algorithm for the generation of new codes from QC codes using ConstructionX, and introduce 35 new record breaking linear codes produced from this method.
45 - Haibo Liu , Qunying Liao 2021
Recently, minimal linear codes have been extensively studied due to their applications in secret sharing schemes, two-party computations, and so on. Constructing minimal linear codes violating the Ashikhmin-Barg condition and determining their weight distributions have been an interesting research topic in coding theory and cryptography. In this paper, basing on exponential sums and Krawtchouk polynomials, we first prove that $g_{(m,k)}$ in cite{Heng-Ding-Zhou}, which is the characteristic function of some subset in $mathbb{F}_3^m$, can be generalized to be $f{(m,k)}$ for obtaining a minimal linear code violating the Ashikhmin-Barg condition; secondly, we employ $overline{g}_{(m,k)}$ to construct a class of ternary minimal linear codes violating the Ashikhmin-Barg condition, whose minimal distance is better than that of codes in cite{Heng-Ding-Zhou}.
We consider network coding for networks experiencing worst-case bit-flip errors, and argue that this is a reasonable model for highly dynamic wireless network transmissions. We demonstrate that in this setup prior network error-correcting schemes can be arbitrarily far from achieving the optimal network throughput. We propose a new metric for errors under this model. Using this metric, we prove a new Hamming-type upper bound on the network capacity. We also show a commensurate lower bound based on GV-type codes that can be used for error-correction. The codes used to attain the lower bound are non-coherent (do not require prior knowledge of network topology). The end-to-end nature of our design enables our codes to be overlaid on classical distributed random linear network codes. Further, we free internal nodes from having to implement potentially computationally intensive link-by-link error-correction.
We prove that, for the binary erasure channel (BEC), the polar-coding paradigm gives rise to codes that not only approach the Shannon limit but do so under the best possible scaling of their block length as a~function of the gap to capacity. This res ult exhibits the first known family of binary codes that attain both optimal scaling and quasi-linear complexity of encoding and decoding. Our proof is based on the construction and analysis of binary polar codes with large kernels. When communicating reliably at rates within $varepsilon > 0$ of capacity, the code length $n$ often scales as $O(1/varepsilon^{mu})$, where the constant $mu$ is called the scaling exponent. It is known that the optimal scaling exponent is $mu=2$, and it is achieved by random linear codes. The scaling exponent of conventional polar codes (based on the $2times 2$ kernel) on the BEC is $mu=3.63$. This falls far short of the optimal scaling guaranteed by random codes. Our main contribution is a rigorous proof of the following result: for the BEC, there exist $elltimesell$ binary kernels, such that polar codes constructed from these kernels achieve scaling exponent $mu(ell)$ that tends to the optimal value of $2$ as $ell$ grows. We furthermore characterize precisely how large $ell$ needs to be as a function of the gap between $mu(ell)$ and $2$. The resulting binary codes maintain the recursive structure of conventional polar codes, and thereby achieve construction complexity $O(n)$ and encoding/decoding complexity $O(nlog n)$.
The subject of this paper is transmission over a general class of binary-input memoryless symmetric channels using error correcting codes based on sparse graphs, namely low-density generator-matrix and low-density parity-check codes. The optimal (or ideal) decoder based on the posterior measure over the code bits, and its relationship to the sub-optimal belief propagation decoder, are investigated. We consider the correlation (or covariance) between two codebits, averaged over the noise realizations, as a function of the graph distance, for the optimal decoder. Our main result is that this correlation decays exponentially fast for fixed general low-density generator-matrix codes and high enough noise parameter, and also for fixed general low-density parity-check codes and low enough noise parameter. This has many consequences. Appropriate performance curves - called GEXIT functions - of the belief propagation and optimal decoders match in high/low noise regimes. This means that in high/low noise regimes the performance curves of the optimal decoder can be computed by density evolution. Another interpretation is that the replica predictions of spin-glass theory are exact. Our methods are rather general and use cluster expansions first developed in the context of mathematical statistical mechanics.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا