Due to their very nature, Spin Waves (SWs) created in the same waveguide, but with different frequencies, can coexist while selectively interacting with their own species only. The absence of inter-frequency interference isolates input data sets encoded in SWs with different frequencies and creates the premises for simultaneous data parallel SW based processing without hardware replication or delay overhead. In this paper we leverage this SW property by introducing a novel computation paradigm, which allows for the parallel processing of n-bit input data vectors on the same basic SW based logic gate. Subsequently, to demonstrate the proposed concept, we present an 8-bit parallel 3-input Majority gate implementation and validate it by means of Object Oriented MicroMagnetic Framework (OOMMF) simulations. To evaluate the potential benefit of our proposal we compare the 8-bit data parallel gate with an equivalent scalar SW gate based implementation. Our evaluation indicates that the 8-bit data parallel 3-input Majority gate implementation requires 4.16x less area than its scalar SW gate based counterpart while preserving the same delay and energy consumption figures.
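As a purely logical illustration of the data parallel idea, the sketch below treats each bit position of an 8-bit word as one frequency channel and evaluates a 3-input Majority gate on all channels at once. The function names, the 8-bit default, and the phase encoding (logic 0 as phase 0, logic 1 as phase π) are assumptions made for this example; it does not model the micromagnetic behaviour simulated in OOMMF.

```python
def maj3(a: int, b: int, c: int, width: int = 8) -> int:
    """Bitwise 3-input majority over width-bit vectors (one bit per channel)."""
    mask = (1 << width) - 1
    return ((a & b) | (a & c) | (b & c)) & mask


def maj3_by_phase(a: int, b: int, c: int, width: int = 8) -> int:
    """Same function, written as per-channel interference.

    Assumed encoding (not taken from the abstract): a 0 bit contributes +1
    (phase 0) and a 1 bit contributes -1 (phase pi). Channels never mix,
    so each bit position is summed independently.
    """
    out = 0
    for k in range(width):
        total = sum(1 - 2 * ((x >> k) & 1) for x in (a, b, c))
        if total < 0:  # net phase pi wins, so the output bit is 1
            out |= 1 << k
    return out


if __name__ == "__main__":
    a, b, c = 0b11001010, 0b10101100, 0b01101001
    print(format(maj3(a, b, c), "08b"))
    print(format(maj3_by_phase(a, b, c), "08b"))
```

Both functions return the same word for any inputs, which mirrors the claim that one gate can process all eight bit positions simultaneously as long as the channels do not interact.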
An electric current controlled spin-wave logic gate based on a width-modulated dynamic magnonic crystal is realized. The device utilizes a spin-wave waveguide fabricated from a single-crystal Yttrium Iron Garnet film and two conducting wires attached to the film surface. Application of electric currents to the wires provides a means for dynamic control of the effective geometry of the waveguide and results in a suppression of the magnonic band gap. The performance of the magnonic crystal as an AND logic gate is demonstrated.
Quantum computation requires qubits that can be coupled and realized in a scalable manner, together with universal and high-fidelity one- and two-qubit logic gates \cite{DiVincenzo2000, Loss1998}. Strong efforts across several fields have led to an impressive array of qubit realizations, including trapped ions \cite{Brown2011}, superconducting circuits \cite{Barends2014}, single photons \cite{Kok2007}, single defects or atoms in diamond \cite{Waldherr2014, Dolde2014} and silicon \cite{Muhonen2014}, and semiconductor quantum dots \cite{Veldhorst2014}, all with single-qubit fidelities exceeding the stringent thresholds required for fault-tolerant quantum computing \cite{Fowler2012}. Despite this, high-fidelity two-qubit gates in the solid state that can be manufactured using standard lithographic techniques have so far been limited to superconducting qubits \cite{Barends2014}, as semiconductor systems have suffered from difficulties in coupling qubits and from dephasing \cite{Nowack2011, Brunner2011, Shulman2012}. Here, we show that these issues can be eliminated altogether using single spins in isotopically enriched silicon \cite{Itoh2014} by demonstrating single- and two-qubit operations in a quantum dot system using the exchange interaction, as envisaged in the original Loss-DiVincenzo proposal \cite{Loss1998}. We realize CNOT gates via either controlled rotation (CROT) or controlled phase (CZ) operations combined with single-qubit operations. Direct gate-voltage control provides single-qubit addressability, together with a switchable exchange interaction that is employed in the two-qubit CZ gate. The speed of the two-qubit CZ operations is controlled electrically via the detuning energy and we find that over 100 two-qubit gates can be performed within a two-qubit coherence time of 8 μs, thereby satisfying the criteria required for scalable quantum computation.
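The statement that a CNOT can be built from a CZ operation combined with single-qubit operations can be checked with ideal gate matrices. The sketch below uses the textbook identity CNOT = (I ⊗ H) · CZ · (I ⊗ H), with a Hadamard on the target qubit. The choice of Hadamard is an assumption for illustration; the abstract does not say which single-qubit rotations are used in the experiment, and the helper names are hypothetical.

```python
import math


def matmul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def kron(A, B):
    """Kronecker product of two square matrices."""
    n, m = len(A), len(B)
    return [[A[i // m][j // m] * B[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]


s = 1 / math.sqrt(2)
H = [[s, s], [s, -s]]          # Hadamard (ideal single-qubit gate)
I2 = [[1, 0], [0, 1]]
CZ = [[1, 0, 0, 0],
      [0, 1, 0, 0],
      [0, 0, 1, 0],
      [0, 0, 0, -1]]
CNOT = [[1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 1, 0]]

# Basis ordering |control, target>: I (x) H applies H to the target only.
IH = kron(I2, H)
built = matmul(matmul(IH, CZ), IH)

# Compare with a tolerance because 1/sqrt(2) is not exact in floating point.
matches = all(math.isclose(built[i][j], CNOT[i][j], abs_tol=1e-12)
              for i in range(4) for j in range(4))

if __name__ == "__main__":
    print("CNOT reproduced from CZ and Hadamards:", matches)
```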
By their very nature, voltage/current excited Spin Waves (SWs) propagate through waveguides without consuming noticeable power. If SW excitation is performed by the continuous application of voltages/currents to the input, which is usually the case, the overall energy consumption is determined by the transducer power and the circuit critical path delay, which leads to high energy consumption because SWs propagate slowly. However, if transducers are operated in pulses, the energy becomes independent of the circuit delay and is mainly determined by the transducer power and delay; pulse operation should therefore be targeted. In this paper, we utilize a 3-input Majority gate (MAJ) to investigate Continuous Mode Operation (CMO) and Pulse Mode Operation (PMO). Moreover, we validate the CMO and PMO 3-input Majority gates by means of micromagnetic simulations. Furthermore, we evaluate and compare the CMO and PMO Majority gate implementations in terms of energy. The results indicate that PMO diminishes the MAJ gate energy consumption by a factor of 18. In addition, we describe how PMO can open the road towards the utilization of the Wave Pipelining (WP) concept in SW circuits. We validate the WP concept by means of micromagnetic simulations and we evaluate its implications in terms of throughput. Our evaluation indicates that for a circuit formed by four cascaded MAJ gates WP increases the throughput by 3.6x.
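The energy argument above can be written as a first-order model: in CMO each transducer is powered for the whole critical path delay, while in PMO it is powered only for the pulse duration. The sketch below is a minimal illustration under that reading. The function names and every numeric value are hypothetical placeholders, not figures from the paper, so the printed ratio is not expected to match the reported factor of 18.

```python
def energy_cmo(n_transducers: int, power_w: float, critical_path_s: float) -> float:
    """Continuous mode: each transducer stays on for the full critical path delay."""
    return n_transducers * power_w * critical_path_s


def energy_pmo(n_transducers: int, power_w: float, pulse_s: float) -> float:
    """Pulse mode: each transducer is on only for the excitation pulse."""
    return n_transducers * power_w * pulse_s


if __name__ == "__main__":
    # Hypothetical values chosen only to exercise the model.
    n = 3                # transducers driving a 3-input Majority gate
    p = 1e-9             # transducer power in watts (placeholder)
    t_path = 10e-9       # critical path delay in seconds (placeholder)
    t_pulse = 1e-9       # pulse duration in seconds (placeholder)

    e_cmo = energy_cmo(n, p, t_path)
    e_pmo = energy_pmo(n, p, t_pulse)
    print("CMO / PMO energy ratio:", e_cmo / e_pmo)
```

In this simplified model the saving reduces to the ratio of critical path delay to pulse duration, which is why the PMO energy no longer depends on how long the circuit takes to evaluate.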
We have designed and tested a parallel 8-bit ERSFQ arithmetic logic unit (ALU). The ALU design employs wave-pipelined instruction execution and features a modular bit-slice architecture that is easily extendable to any number of bits and adaptable to current recycling. A carry signal synchronized with asynchronous instruction propagation provides the wave-pipelined operation of the ALU. The ALU instruction set consists of 14 arithmetical and logical instructions. It has been designed and simulated for operation at clock rates up to 10 GHz in the 10-kA/cm² fabrication process. The ALU is embedded into a shift-register-based high-frequency testbed with an on-chip clock generator to allow for comprehensive high-frequency testing for all possible operands. The 8-bit ERSFQ ALU, comprising 6840 Josephson junctions, has been fabricated with the MIT Lincoln Lab 10-kA/cm² SFQ5ee fabrication process featuring eight Nb wiring layers and a high-kinetic inductance layer needed for ERSFQ technology. We evaluated the bias margins for all instructions and various operands at both low and high clock frequencies. At low frequency, clock propagation and all instruction propagation through the ALU were observed with bias margins of +/-11% and +/-9%, respectively. Also at low speed, the ALU exhibited correct functionality for all arithmetical and logical instructions with +/-6% bias margins. We tested the 8-bit ALU for all instructions up to a 2.8 GHz clock frequency.
Spin Waves (SWs) enable the realization of energy efficient circuits as they propagate and interfere within waveguides without consuming noticeable energy. However, SW computing can be made even more energy efficient by taking advantage of the approximate computing paradigm, as many applications, such as multimedia and social media, are error-tolerant. In this paper we propose a novel ultra-low energy Approximate Full Adder (AFA) and an approximate 2-bit input Multiplier (AMUL). We validate the correct functionality of our proposal by means of micromagnetic simulations and evaluate the approximate FA figures of merit against state-of-the-art accurate SW, 7nm CMOS, Spin Hall Effect (SHE), Domain Wall Motion (DWM), accurate and approximate 45nm CMOS, Magnetic Tunnel Junction (MTJ), and Spin-CMOS FA implementations. Our results indicate that AFA consumes 43% and 33% less energy than state-of-the-art accurate SW and 7nm CMOS FA, respectively, saves 69% and 44% when compared with accurate and approximate 45nm CMOS, respectively, and provides an energy reduction of 2 orders of magnitude when compared with accurate SHE, accurate and approximate DWM, MTJ, and Spin-CMOS counterparts. In addition, it achieves the same error rate as the approximate 45nm CMOS and Spin-CMOS FA, whereas it exhibits a 50% lower error rate than the approximate DWM FA. Furthermore, it outperforms its contenders in terms of area by saving at least 29% chip real-estate. AMUL is evaluated and compared with state-of-the-art accurate SW designs and with state-of-the-art accurate and approximate 16nm CMOS designs. The evaluation results indicate that it consumes at least 2x and 5x less energy than the state-of-the-art SW designs and the 16nm CMOS accurate and approximate designs, respectively. AMUL has an average error rate of 10%, compared with 13% for the approximate CMOS MUL, and requires at least 64% less chip real-estate.
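To make the notion of an adder error rate concrete, the sketch below compares an exact full adder with one simple approximate variant over all eight input combinations. The approximate design used here (carry from a 3-input majority, sum taken as the inverted carry) is a generic illustration chosen for this example and is not claimed to be the AFA proposed in the paper; all function names are hypothetical.

```python
from itertools import product


def exact_fa(a: int, b: int, c: int) -> tuple[int, int]:
    """Exact full adder: returns (sum, carry)."""
    return a ^ b ^ c, (a & b) | (a & c) | (b & c)


def approx_fa(a: int, b: int, c: int) -> tuple[int, int]:
    """Hypothetical approximate adder: carry = MAJ(a, b, c), sum = NOT carry."""
    carry = (a & b) | (a & c) | (b & c)
    return 1 - carry, carry


def error_rate(approx, exact) -> float:
    """Fraction of input combinations on which any output bit differs."""
    cases = list(product((0, 1), repeat=3))
    wrong = sum(1 for a, b, c in cases if approx(a, b, c) != exact(a, b, c))
    return wrong / len(cases)


if __name__ == "__main__":
    print("error rate:", error_rate(approx_fa, exact_fa))
```

For this particular variant only the all-zeros and all-ones inputs produce a wrong sum bit, so the error rate is 2 out of 8 input combinations.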