Simulating spin systems on IANUS, an FPGA-based computer

Added by Andrea Maiorano

Publication date 2007

fields Physics Informatics Engineering

Language English

Authors F. Belletti - M. Cotallo - A. Cruz

Disordered Systems and Neural Networks Hardware Architecture

Abstract

We describe the hardwired implementation of algorithms for Monte Carlo simulations of a large class of spin models. We have implemented these algorithms as VHDL codes and we have mapped them onto a dedicated processor based on a large FPGA device. The measured performance on one such processor is comparable to O(100) carefully programmed high-end PCs: it turns out to be even better for some selected spin models. We describe here codes that we are currently executing on the IANUS massively parallel FPGA-based system.
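For readers unfamiliar with the kind of algorithm the abstract refers to, the following is a minimal software sketch of a Metropolis Monte Carlo update. It is not the authors' VHDL code: it assumes the simplest member of the class, a two-dimensional Ising model with coupling J = 1 and periodic boundaries, and the names `metropolis_sweep` and `energy_per_spin` are hypothetical, chosen only for this illustration.

```python
import math
import random


def metropolis_sweep(spins, beta, rng):
    """Attempt size*size single-spin flips on a square Ising lattice (J = 1)."""
    size = len(spins)
    for _ in range(size * size):
        i = rng.randrange(size)
        j = rng.randrange(size)
        # Sum of the four nearest neighbours, with periodic boundaries.
        neighbours = (
            spins[(i + 1) % size][j] + spins[(i - 1) % size][j]
            + spins[i][(j + 1) % size] + spins[i][(j - 1) % size]
        )
        # Energy change if spin (i, j) is flipped, for H = -sum_<ij> s_i s_j.
        delta_e = 2 * spins[i][j] * neighbours
        # Metropolis rule: always accept downhill moves, uphill with exp(-beta * dE).
        if delta_e <= 0 or rng.random() < math.exp(-beta * delta_e):
            spins[i][j] = -spins[i][j]


def energy_per_spin(spins):
    """Energy per site, counting each nearest-neighbour bond once."""
    size = len(spins)
    total = 0
    for i in range(size):
        for j in range(size):
            total -= spins[i][j] * (spins[(i + 1) % size][j] + spins[i][(j + 1) % size])
    return total / (size * size)


# Illustrative run: lattice size, temperature and sweep count are arbitrary choices.
rng = random.Random(0)
L = 16
spins = [[rng.choice((-1, 1)) for _ in range(L)] for _ in range(L)]
for _ in range(200):
    metropolis_sweep(spins, beta=1.0, rng=rng)
print(energy_per_spin(spins))
```

Each update needs only a few integer operations and one random number, which suggests why such algorithms are natural candidates for a hardwired, highly parallel implementation of the kind the abstract describes.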

Ianus: an Adpative FPGA Computer

Ianus Collaboration: F. Belletti, I. Campos, A. Cruz, 2005

Dedicated machines designed for specific computational algorithms can outperform conventional computers by several orders of magnitude. In this note we describe Ianus, a new-generation FPGA-based machine, and its basic features: hardware integration and wide reprogrammability. Our goal is to build a machine that can fully exploit the performance potential of new-generation FPGA devices. We also plan a software platform which simplifies its programming, in order to extend its intended range of application to a wide class of interesting and computationally demanding problems. The decision to develop a dedicated processor is a complex one, involving careful assessment of its performance lead, during its expected lifetime, over traditional computers, taking into account their performance increase, as predicted by Moore's law. We discuss this point in detail.

Disordered Systems and Neural Networks Other Condensed Matter Computational Physics
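The lifetime argument in the Ianus abstract above can be illustrated with a back-of-the-envelope calculation. This is a sketch under stated assumptions, not the analysis given in the paper: it assumes commodity performance doubles at a fixed interval (the 1.5-year default is an assumed value) and that the dedicated machine's own performance stays constant. The function name `years_of_lead` is hypothetical.

```python
import math


def years_of_lead(speedup, doubling_years=1.5):
    """Years until conventional hardware catches up with a dedicated machine.

    speedup: how many times faster the dedicated machine is today.
    doubling_years: assumed doubling period of conventional performance.
    """
    if speedup <= 1:
        return 0.0  # no lead to begin with
    # Conventional performance grows as 2 ** (t / doubling_years);
    # solving 2 ** (t / doubling_years) = speedup for t gives the result.
    return doubling_years * math.log2(speedup)


# A factor-of-8 lead with a 2-year doubling period lasts 3 doublings.
print(years_of_lead(8, doubling_years=2.0))  # 6.0
```

Under these assumptions, a lead of about two orders of magnitude corresponds to roughly a decade, which is why the assessment depends strongly on how large the initial speedup is.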

Simulating Spin Waves in Entropy Stabilized Oxides

Tom Berlijn, Gonzalo Alvarez, David S. Parker, 2020

The entropy stabilized oxide Mg$_{0.2}$Co$_{0.2}$Ni$_{0.2}$Cu$_{0.2}$Zn$_{0.2}$O exhibits antiferromagnetic order and magnetic excitations, as revealed by recent neutron scattering experiments. This observation raises the question of the nature of spin wave excitations in such disordered systems. Here, we investigate theoretically the magnetic ground state and the spin-wave excitations using linear spin-wave theory in combination with the supercell approximation to take into account the extreme disorder in this magnetic system. We find that the experimentally observed antiferromagnetic structure can be stabilized by a rhombohedral distortion together with large second nearest neighbor interactions. Our calculations show that the spin-wave spectrum consists of a well-defined low-energy coherent spectrum in the background of an incoherent continuum that extends to higher energies.

Disordered Systems and Neural Networks Strongly Correlated Electrons

unzipFPGA: Enhancing FPGA-based CNN Engines with On-the-Fly Weights Generation

Stylianos I. Venieris, Javier Fernandez-Marques, Nicholas D. Lane, 2021

Single computation engines have become a popular design choice for FPGA-based convolutional neural networks (CNNs) enabling the deployment of diverse models without fabric reconfiguration. This flexibility, however, often comes with significantly reduced performance on memory-bound layers and resource underutilisation due to suboptimal mapping of certain layers on the engine's fixed configuration. In this work, we investigate the implications in terms of CNN engine design for a class of models that introduce a pre-convolution stage to decompress the weights at run time. We refer to these approaches as on-the-fly. To minimise the negative impact of limited bandwidth on memory-bound layers, we present a novel hardware component that enables the on-chip on-the-fly generation of weights. We further introduce an input selective processing element (PE) design that balances the load between PEs on suboptimally mapped layers. Finally, we present unzipFPGA, a framework to train on-the-fly models and traverse the design space to select the highest performing CNN engine configuration. Quantitative evaluation shows that unzipFPGA yields an average speedup of 2.14x and 71% over optimised status-quo and pruned CNN engines under constrained bandwidth and up to 3.69x higher performance density over the state-of-the-art FPGA-based CNN accelerators.

Computer Vision and Pattern Recognition Hardware Architecture Machine Learning

QPACE -- a QCD parallel computer based on Cell processors

H. Baier, H. Boettiger, M. Drochner, 2009

QPACE is a novel parallel computer which has been developed to be primarily used for lattice QCD simulations. The compute power is provided by the IBM PowerXCell 8i processor, an enhanced version of the Cell processor that is used in the PlayStation 3. The QPACE nodes are interconnected by a custom, application-optimized 3-dimensional torus network implemented on an FPGA. To achieve the very high packaging density of 26 TFlops per rack, a new water cooling concept has been developed and successfully realized. In this paper we give an overview of the architecture and highlight some important technical details of the system. Furthermore, we provide initial performance results and report on the installation of 8 QPACE racks providing an aggregate peak performance of 200 TFlops.

High Energy Physics - Lattice Hardware Architecture

FPGA-based Binocular Image Feature Extraction and Matching System

Qi Ni, Fei Wang, Ziwei Zhao, 2019

Image feature extraction and matching is a fundamental but computation-intensive task in machine vision. This paper proposes a novel FPGA-based embedded system to accelerate feature extraction and matching. It implements SURF feature point detection and BRIEF feature descriptor construction and matching. For binocular stereo vision, feature matching includes both tracking matching and stereo matching, which simultaneously provide feature point correspondences and parallax information. Our system is evaluated on a ZYNQ XC7Z045 FPGA. The result demonstrates that it can process binocular video data at a high frame rate (640$\times$480 at 162 fps). Moreover, an extensive test shows that our system is robust to image compression, blurring and illumination changes.

Computer Vision and Pattern Recognition Hardware Architecture Multimedia
