بحث متقدم مدعوم من الذكاء الصنعي

مساحة جديدة

اشترك بالحزمة الذهبية واحصل على وصول غير محدود شمرا أكاديميا

تسجيل مستخدم جديد

Strategies for High-Throughput FPGA-based QC-LDPC Decoder Architecture

551 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Swapnil Mhaske

تاريخ النشر 2015

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Swapnil Mhaske - Hojin Kee - Tai Ly

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

We propose without loss of generality strategies to achieve a high-throughput FPGA-based architecture for a QC-LDPC code based on a circulant-1 identity matrix construction. We present a novel representation of the parity-check matrix (PCM) providing a multi-fold throughput gain. Splitting of the node processing algorithm enables us to achieve pipelining of blocks and hence layers. By partitioning the PCM into not only layers but superlayers we derive an upper bound on the pipelining depth for the compact representation. To validate the architecture, a decoder for the IEEE 802.11n (2012) QC-LDPC is implemented on the Xilinx Kintex-7 FPGA with the help of the FPGA IP compiler [2] available in the NI LabVIEW Communication System Design Suite (CSDS) which offers an automated and systematic compilation flow where an optimized hardware implementation from the LDPC algorithm was generated in approximately 3 minutes, achieving an overall throughput of 608Mb/s (at 260MHz). As per our knowledge this is the fastest implementation of the IEEE 802.11n QC-LDPC decoder using an algorithmic compiler.

قيم البحث

85 - Nicolas Delfosse , Vivien Londe , Michael Beverland 2021

Quantum LDPC codes are a promising direction for low overhead quantum computing. In this paper, we propose a generalization of the Union-Find decoder as adecoder for quantum LDPC codes. We prove that this decoder corrects all errors with weight up to An^{alpha} for some A, {alpha} > 0 for different classes of quantum LDPC codes such as toric codes and hyperbolic codes in any dimension D geq 3 and quantum expander codes. To prove this result, we introduce a notion of covering radius which measures the spread of an error from its syndrome. We believe this notion could find application beyond the decoding problem. We also perform numerical simulations, which show that our Union-Find decoder outperforms the belief propagation decoder in the low error rate regime in the case of a quantum LDPC code with length 3600.

فيزياء الكم نظرية المعلومات نظرية المعلومات

FPGA design of a cdma2000 turbo decoder

400 - Maribell Sacanamboy Franco , Fabio G. Guerrero 2014

This paper presents the FPGA hardware design of a turbo decoder for the cdma2000 standard. The work includes a study and mathematical analysis of the turbo decoding process, based on the MAX-Log-MAP algorithm. Results of decoding for a packet size of two hundred fifty bits are presented, as well as an analysis of area versus performance, and the key variables for hardware design in turbo decoding.

هندسة العتاد

JANUS: an FPGA-based System for High Performance Scientific Computing

510 - F. Belletti , M. Cotallo , A. Cruz 2008

This paper describes JANUS, a modular massively parallel and reconfigurable FPGA-based computing system. Each JANUS module has a computational core and a host. The computational core is a 4x4 array of FPGA-based processing elements with nearest-neigh bor data links. Processors are also directly connected to an I/O node attached to the JANUS host, a conventional PC. JANUS is tailored for, but not limited to, the requirements of a class of hard scientific applications characterized by regular code structure, unconventional data manipulation instructions and not too large data-base size. We discuss the architecture of this configurable machine, and focus on its use on Monte Carlo simulations of statistical mechanics. On this class of application JANUS achieves impressive performances: in some cases one JANUS processing element outperfoms high-end PCs by a factor ~ 1000. We also discuss the role of JANUS on other classes of scientific applications.

هندسة العتاد

High-throughput GPU layered decoder of multi-edge type low density parity check codes in continuous-variable quantum key distribution systems

56 - Yang Li , Xiaofang Zhang , Yong Li 2020

The decoding throughput in the postprocessing is one of the bottlenecks for a continuous-variable quantum key distribution (CV-QKD) system. In this paper, we propose a layered decoder to decode quasi-cyclic multi-edge type LDPC (QC-METLDPC) codes bas ed on graphic processing unit (GPU) in continuous-variable quantum key distribution (CV-QKD) systems. We optimize the storage method of the parity check matrix, merge the sub-matrices which are unrelated, and decode multiple codewords in parallel on GPU. Simulation results demonstrate that the average decoding speed of LDPC codes with three typical code rates, i.e., 0.1, 0.05 and 0.02, is up to 64.11Mbits/s, 48.65Mbits/s and 39.51Mbits/s, respectively, when decoding 128 codewords of length 106 simultaneously without early termination.

فيزياء الكم نظرية المعلومات نظرية المعلومات

CIAO: Cache Interference-Aware Throughput-Oriented Architecture and Scheduling for GPUs

94 - Jie Zhang , Shuwen Gao , Nam Sung Kim 2018

A modern GPU aims to simultaneously execute more warps for higher Thread-Level Parallelism (TLP) and performance. When generating many memory requests, however, warps contend for limited cache space and thrash cache, which in turn severely degrades p erformance. To reduce such cache thrashing, we may adopt cache locality-aware warp scheduling which gives higher execution priority to warps with higher potential of data locality. However, we observe that warps with high potential of data locality often incurs far more cache thrashing or interference than warps with low potential of data locality. Consequently, cache locality-aware warp scheduling may undesirably increase cache interference and/or unnecessarily decrease TLP. In this paper, we propose Cache Interference-Aware throughput-Oriented (CIAO) on-chip memory architecture and warp scheduling which exploit unused shared memory space and take insight opposite to cache locality-aware warp scheduling. Specifically, CIAO on-chip memory architecture can adaptively redirect memory requests of severely interfering warps to unused shared memory space to isolate memory requests of these interfering warps from those of interfered warps. If these interfering warps still incur severe cache interference, CIAO warp scheduling then begins to selectively throttle execution of these interfering warps. Our experiment shows that CIAO can offer 54% higher performance than prior cache locality-aware scheduling at a small chip cost.

هندسة العتاد

سجل دخول لتتمكن من نشر تعليقات

التعليقات

جاري جلب التعليقات

سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها

جامعة قرطبة الخاصة

تفاصيل إضافية المزيد من الجامعات

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Strategies for High-Throughput FPGA-based QC-LDPC Decoder Architecture

اسأل ChatGPT حول البحث

ﻻ يوجد ملخص باللغة العربية

اقرأ أيضاً