NaNet: a flexible and configurable low-latency NIC for real-time trigger systems based on GPUs

Published by: Alessandro Lonardo

Publication date: 2013

Research field: Physics, Informatics Engineering

Language: English

Authors: R. Ammendola, A. Biagioni, O. Frezza

Subjects: Instrumentation and Detectors; Distributed, Parallel, and Cluster Computing

Abstract

NaNet is an FPGA-based PCIe x8 Gen2 NIC supporting 1/10 GbE links and the custom 34 Gbps APElink channel. The design has GPUDirect RDMA capabilities and features a network stack protocol offloading module, making it suitable for building low-latency, real-time GPU-based computing systems. We provide a detailed description of the modular NaNet hardware architecture. Latency and bandwidth benchmarks for the GbE and APElink channels are presented, followed by a performance analysis of a case study, the GPU-based low-level trigger for the RICH detector in the NA62 CERN experiment, using either the NaNet GbE or the APElink channel. Finally, we outline the project's future activities.
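
To put the link rates quoted in the abstract into perspective, the following minimal sketch compares them with the usable bit rate of a PCIe x8 Gen2 host interface. It relies only on the standard PCIe Gen2 figures (5 GT/s per lane, 8b/10b line coding); the helper name `pcie_gen2_payload_gbps` is hypothetical, packet-level protocol overhead is ignored, and the abstract does not say whether the 34 Gbps APElink figure is a raw or a payload rate, so the comparison is indicative only.

```python
# Back-of-the-envelope link budget. The constants are standard PCIe Gen2
# figures; the helper name is hypothetical and not part of NaNet.
PCIE_GEN2_GT_PER_LANE = 5.0    # transfer rate per lane, in GT/s
ENCODING_EFFICIENCY = 8 / 10   # 8b/10b line coding used by PCIe Gen2


def pcie_gen2_payload_gbps(lanes: int) -> float:
    """Usable bit rate per direction, before TLP/DLLP protocol overhead."""
    return lanes * PCIE_GEN2_GT_PER_LANE * ENCODING_EFFICIENCY


host_gbps = pcie_gen2_payload_gbps(8)

# Link rates as quoted in the abstract (nominal values, in Gbps).
for name, rate in [("1 GbE", 1.0), ("10 GbE", 10.0), ("APElink", 34.0)]:
    share = rate / host_gbps
    print(f"{name}: {rate:g} Gbps, {share:.0%} of the PCIe x8 Gen2 rate")
```

Under these assumptions the host interface, not the custom link, would bound sustained throughput when the APElink channel runs near its nominal rate.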

Roberto Ammendola, Andrea Biagioni, Riccardo Fantechi, 2013

We implemented the NaNet FPGA-based PCIe Gen2 GbE/APElink NIC, featuring GPUDirect RDMA capabilities and UDP protocol management offloading. NaNet can receive a UDP input data stream from its GbE interface and redirect it, without any intermediate buffering or CPU intervention, to the memory of a Fermi/Kepler GPU hosted on the same PCIe bus, provided that the two devices share the same upstream root complex. Synthetic benchmarks for latency and bandwidth are presented. We describe how NaNet can be employed in the prototype of the GPU-based RICH low-level trigger processor of the NA62 CERN experiment to implement the data link between the TEL62 readout boards and the low-level trigger processor. Results for the throughput and latency of the integrated system are presented and discussed.

Subjects: Instrumentation and Detectors

NaNet: a Low-Latency, Real-Time, Multi-Standard Network Interface Card with GPUDirect Features

A. Lonardo, F. Ameli, R. Ammendola, 2014

While the GPGPU paradigm is widely recognized as an effective approach to high performance computing, its adoption in low-latency, real-time systems is still in its early stages. Although GPUs typically show deterministic latency when executing computational kernels once data is available in their internal memories, assessing the real-time features of a standard GPGPU system requires careful characterization of all subsystems along the data stream path. The networking subsystem turns out to be the most critical one in terms of both the absolute value and the fluctuations of its response latency. Our envisioned solution to this issue is NaNet, an FPGA-based PCIe Network Interface Card (NIC) design featuring a configurable and extensible set of network channels with direct access through GPUDirect to NVIDIA Fermi/Kepler GPU memories. The NaNet design currently supports both standard channels, GbE (1000BASE-T) and 10GbE (10GBASE-R), and custom ones, the 34 Gbps APElink and the 2.5 Gbps deterministic-latency KM3link, and its modularity allows for straightforward inclusion of other link technologies. To avoid host OS intervention on the data stream and remove a possible source of jitter, the design includes a network/transport layer offload module with cycle-accurate, upper-bound latency, supporting the UDP, KM3link Time Division Multiplexing and APElink protocols. After describing the NaNet architecture and characterizing its latency and bandwidth for all supported links, we present two real-world use cases: the GPU-based low-level trigger for the RICH detector in the NA62 experiment at CERN and the on-/off-shore data link for the KM3 underwater neutrino telescope.
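
The abstract characterizes a subsystem by the absolute value and the fluctuations of its response latency. As a minimal sketch of what such a characterization might report, the snippet below summarizes a list of latency samples by mean, standard deviation (taken here as a simple jitter measure) and worst case. The function name `summarize_latency` and the sample values are hypothetical; the numbers are made up for illustration and are not measurements from the paper.

```python
import statistics


def summarize_latency(samples_us):
    """Return mean, jitter (population std. dev.) and worst case, in microseconds."""
    return {
        "mean_us": statistics.fmean(samples_us),
        "jitter_us": statistics.pstdev(samples_us),
        "max_us": max(samples_us),
    }


# Made-up values for illustration only; NOT measurements from the paper.
# A single outlier barely moves the mean but dominates the worst case,
# which is the figure that matters for a hard real-time budget.
example_samples_us = [10.0, 10.5, 9.5, 10.0, 30.0]
print(summarize_latency(example_samples_us))
```

For a trigger system with a fixed time budget, the worst-case value is usually the binding constraint, which is why an upper-bound latency guarantee in the offload module matters more than a low average.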

Subjects: Instrumentation and Detectors; Hardware Architecture

Real-time Control System Prototype for Mechanical and Optical Systems based on Parallel Computing Techniques

F. Acernese, F. Barone, R. De Rosa, 2003

Real-time control systems often require dedicated hardware and software, including real-time operating systems, while many systems are available for off-line computing, mainly based on standard system units (PCs), standard network connections (Ethernet), standard operating systems (Linux) and software independent of the particular architecture of the single unit. To combine the advantages of both technologies, we built a hybrid control system prototype using a network-based parallel computing architecture within a real-time control system. In this paper we describe the architecture of the implemented system, the preliminary tests we performed for its characterization and the architecture of the control system we used for the real-time control tests.

Subjects: Instrumentation and Detectors

LOCx2, a Low-latency, Low-overhead, 2 x 5.12-Gbps Transmitter ASIC for the ATLAS Liquid Argon Calorimeter Trigger Upgrade

Le Xiao, Xiaoting Li, Datao Gong, 2020

In this paper, we present the design and test results of LOCx2, a transmitter ASIC for the ATLAS Liquid Argon Calorimeter trigger upgrade. LOCx2 consists of two channels; each channel encodes ADC data with an overhead of 14.3% and transmits serial data at 5.12 Gbps with a latency of less than 27.2 ns. LOCx2 is fabricated in a commercial 0.25-µm Silicon-on-Sapphire CMOS technology and is packaged in a 100-pin QFN package. The power consumption of LOCx2 is about 843 mW.
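
The quoted 14.3% overhead and 5.12 Gbps line rate imply a per-channel payload rate, but the abstract does not state whether the overhead is measured relative to the payload bits or to the total transmitted bits. The sketch below, with hypothetical function names, simply evaluates both readings rather than assuming one.

```python
LINE_RATE_GBPS = 5.12   # serial rate per channel, from the abstract
OVERHEAD = 0.143        # encoding overhead, from the abstract
CHANNELS = 2            # LOCx2 has two channels


def payload_if_relative_to_payload(line_rate: float, overhead: float) -> float:
    """Reading A (assumption): overhead = extra bits / payload bits."""
    return line_rate / (1.0 + overhead)


def payload_if_relative_to_line(line_rate: float, overhead: float) -> float:
    """Reading B (assumption): overhead = extra bits / total transmitted bits."""
    return line_rate * (1.0 - overhead)


for label, fn in [("relative to payload", payload_if_relative_to_payload),
                  ("relative to line rate", payload_if_relative_to_line)]:
    per_channel = fn(LINE_RATE_GBPS, OVERHEAD)
    print(f"{label}: {per_channel:.2f} Gbps per channel, "
          f"{CHANNELS * per_channel:.2f} Gbps for both channels")
```

The two readings differ by about 0.1 Gbps per channel, so the distinction matters only when sizing the upstream ADC data rate precisely.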

Subjects: Instrumentation and Detectors

Cloud based Real-Time and Low Latency Scientific Event Analysis

Chen Yang, Xiaofeng Meng, Zhihui Du, 2018

Astronomy is well recognized as a big-data-driven science. As novel observation infrastructures are developed, sky survey cycles have been shortened from a few days to a few seconds, causing data processing pressure to shift from offline to online. However, existing scientific databases focus on offline analysis of long-term historical data, not on real-time, low-latency analysis of large-scale newly arriving data. In this paper, a cloud-based method is proposed to efficiently analyze scientific events on large-scale newly arriving data. The solution is implemented as a highly efficient system named Aserv. A set of compact data store and index structures is proposed to describe the scientific events, and a typical analysis pattern is formalized as a set of query operations. A domain-aware filter, accuracy-aware data partitioning, a highly efficient index and frequently used statistical data designs are the four key methods used to optimize the performance of Aserv. Experimental results in a typical cloud environment show that the presented optimization mechanisms can meet the low-latency demand for both large data insertion and scientific event analysis. Aserv can insert 3.5 million rows of data within 3 seconds and perform the heaviest query on 6.7 billion rows of data, also within 3 seconds. Furthermore, a performance model is given to help Aserv choose the right cloud resource setup to meet the guaranteed real-time performance requirement.

Subjects: Databases
