
We introduce the IBM Analog Hardware Acceleration Kit, a new and first-of-a-kind open-source toolkit to simulate analog crossbar arrays conveniently from within PyTorch (freely available at https://github.com/IBM/aihwkit). The toolkit is under active development and is centered around the concept of an analog tile, which captures the computations performed on a crossbar array. Analog tiles are building blocks that can be used to extend existing network modules with analog components and to compose arbitrary artificial neural networks (ANNs) using the flexibility of the PyTorch framework. Analog tiles can be conveniently configured to emulate a plethora of different analog hardware characteristics and their non-idealities, such as device-to-device and cycle-to-cycle variations, resistive device response curves, and weight and output noise. Additionally, the toolkit makes it possible to design custom unit-cell configurations and to use advanced analog optimization algorithms such as Tiki-Taka. Moreover, the backward and update behavior can be set to ideal to enable hardware-aware training features for chips that target inference acceleration only. To evaluate the inference accuracy of such chips over time, we provide statistical programming-noise and drift models calibrated on phase-change memory hardware. Our new toolkit is fully GPU accelerated and can be used to conveniently estimate the impact of material properties and non-idealities of future analog technology on the accuracy of arbitrary ANNs.
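The analog-tile idea described above can be sketched in a few lines. The following is a minimal conceptual illustration, not the actual aihwkit API: the class name, parameter names, and noise values are all assumptions chosen only to show the two kinds of non-idealities the abstract mentions (device-to-device programming variation and cycle-to-cycle output noise).

```python
import numpy as np

# Hedged sketch (NOT the aihwkit API): an "analog tile" performing a
# matrix-vector product with programming noise and readout noise.
class AnalogTile:
    def __init__(self, weights, w_noise=0.02, out_noise=0.01, seed=0):
        self.rng = np.random.default_rng(seed)
        # Device-to-device variation: each stored conductance deviates
        # slightly from its target value when the weight is programmed.
        self.weights = weights + self.rng.normal(0.0, w_noise, weights.shape)
        self.out_noise = out_noise

    def forward(self, x):
        # Cycle-to-cycle variation: additive noise on every analog readout.
        y = self.weights @ x
        return y + self.rng.normal(0.0, self.out_noise, y.shape)

tile = AnalogTile(np.eye(4))        # tile programmed with an identity matrix
x = np.array([1.0, 2.0, 3.0, 4.0])
y = tile.forward(x)                 # approximately x, but never exactly
```

In the real toolkit such a tile additionally models response curves, bounds, and drift, and plugs into PyTorch modules; the sketch only captures the core noisy-matrix-vector-product abstraction.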
A resistive memory device-based computing architecture is one of the promising platforms for energy-efficient Deep Neural Network (DNN) training accelerators. The key technical challenge in realizing such accelerators is to accumulate the gradient information without a bias. Unlike digital numbers in software, which can be assigned and accessed with the desired accuracy, numbers stored in resistive memory devices can only be manipulated following the physics of the device, which can significantly limit the training performance. Therefore, additional techniques and algorithm-level remedies are required to achieve the best possible performance in resistive memory device-based accelerators. In this paper, we analyze asymmetric conductance modulation characteristics in RRAM using a soft-bound synapse model and present an in-depth analysis of the relationship between device characteristics and DNN model accuracy using a 3-layer DNN trained on the MNIST dataset. We show that an imbalance between up and down updates leads to poor network performance. We introduce the concept of a symmetry point and propose a zero-shifting technique that compensates for the imbalance by programming the reference device and changing the zero-value point of the weight. Using this zero-shifting method, we show that network performance improves dramatically for imbalanced synapse devices.
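The soft-bound update and the symmetry point can be illustrated numerically. In this hedged sketch (the slopes, bounds, and bisection search are assumptions, not the paper's device parameters), the up-step magnitude shrinks as the weight approaches the upper bound and the down-step magnitude shrinks toward the lower bound; the symmetry point is the weight at which the two magnitudes coincide, and zero-shifting programs the reference device there so that this point reads as weight zero.

```python
# Illustrative soft-bound step magnitudes (assumed parameters).
def up_step(w, alpha_up=0.01, w_max=1.0):
    return alpha_up * (1.0 - w / w_max)   # shrinks as w -> w_max

def down_step(w, alpha_dn=0.02, w_min=-1.0):
    return alpha_dn * (1.0 - w / w_min)   # shrinks as w -> w_min

def symmetry_point(w_lo=-1.0, w_hi=1.0, iters=60):
    # Bisection: find w0 where an up step and a down step have equal
    # magnitude, so alternating +/- pulses no longer drift the weight.
    for _ in range(iters):
        mid = 0.5 * (w_lo + w_hi)
        if up_step(mid) > down_step(mid):
            w_lo = mid                    # root lies above mid
        else:
            w_hi = mid
    return 0.5 * (w_lo + w_hi)

w0 = symmetry_point()
# Zero-shifting: program the reference device to the conductance at w0,
# so the effective weight (device minus reference) is zero exactly at
# the symmetry point, where updates are balanced.
```

With these assumed slopes (0.01 up vs. 0.02 down), the symmetry point sits at w0 = -1/3; shifting the zero there is what removes the systematic drift that otherwise biases gradient accumulation.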
In a previous work we detailed the requirements for obtaining a maximal performance benefit by implementing fully connected deep neural networks (DNN) in the form of arrays of resistive devices for deep learning. Here we extend this concept of Resistive Processing Unit (RPU) devices toward convolutional neural networks (CNNs). We show how to map the convolutional layers to RPU arrays such that the parallelism of the hardware can be fully utilized in all three cycles of the backpropagation algorithm. We find that the noise and bound limitations imposed by the analog nature of the computations performed on the arrays affect the training accuracy of the CNNs. Noise and bound management techniques are presented that mitigate these problems without introducing any additional complexity in the analog circuits, as they can be addressed by the digital circuits. In addition, we discuss digitally programmable update management and device-variability reduction techniques that can be used selectively for some of the layers in a CNN. We show that the combination of all these techniques enables a successful application of the RPU concept to training CNNs. The techniques discussed here are more general, can be applied beyond CNN architectures, and therefore extend the applicability of the RPU approach to a large class of neural network architectures.
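The noise- and bound-management idea can be sketched as purely digital scaling around a bounded, noisy analog matrix-vector product. In this hedged illustration, the bound, noise level, and function names are assumptions rather than the paper's circuit parameters: the input is scaled by its maximum magnitude so the signal is large relative to the fixed analog noise (noise management), and if the output saturates, the input is rescaled and the operation retried (bound management).

```python
import numpy as np

BOUND = 12.0  # assumed analog output saturation at +/- BOUND

def analog_mvp(W, x, rng):
    # Bounded analog compute: fixed additive output noise, then saturation.
    y = W @ x + rng.normal(0.0, 0.05, W.shape[0])
    return np.clip(y, -BOUND, BOUND)

def managed_mvp(W, x, rng):
    # Noise management: scale the input by its max magnitude so the
    # signal uses the full input range relative to the fixed noise.
    scale = np.max(np.abs(x))
    if scale == 0.0:
        return np.zeros(W.shape[0])
    while True:
        y = analog_mvp(W, x / scale, rng)
        if np.max(np.abs(y)) < BOUND:   # no output saturation detected
            return y * scale            # undo the digital scaling
        scale *= 2.0                    # bound management: shrink and retry

W = 10.0 * np.ones((1, 4))
x = np.ones(4)                                         # ideal result: [40.0]
naive = analog_mvp(W, x, np.random.default_rng(1))     # saturates at 12.0
managed = managed_mvp(W, x, np.random.default_rng(1))  # recovers ~40.0
```

The point of the sketch is that both corrections live entirely in the digital periphery: the analog array itself is unchanged, matching the abstract's claim that no additional analog-circuit complexity is introduced.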
377 - Oki Gunawan, Tayfun Gokmen, 2014
Low open-circuit voltage ($V_{OC}$) has been recognized as the number one problem in the current generation of Cu$_{2}$ZnSn(Se,S)$_{4}$ (CZTSSe) solar cells. We report high-light-intensity and low-temperature Suns-$V_{OC}$ measurements in high-performance CZTSSe devices. The Suns-$V_{OC}$ curves exhibit bending at high light intensity, which points to several prospective $V_{OC}$-limiting mechanisms that could impact the $V_{OC}$, even at 1 sun for lower-performing samples. These $V_{OC}$-limiting mechanisms include low bulk conductivity (because of low hole density or low mobility), bulk or interface defects including tail states, and a non-ohmic back contact for low-carrier-density CZTSSe. The non-ohmic back contact problem can be detected by Suns-$V_{OC}$ measurements with different monochromatic illumination. These limiting factors may also contribute to an artificially lower $J_{SC}$-$V_{OC}$ diode ideality factor.
In an ideal two-component two-dimensional electron system, particle-hole symmetry dictates that the fractional quantum Hall states around $\nu = 1/2$ are equivalent to those around $\nu = 3/2$. We demonstrate that composite fermions (CFs) around $\nu = 1/2$ in AlAs possess a valley degree of freedom like their counterparts around $\nu = 3/2$. However, focusing on $\nu = 2/3$ and 4/3, we find that the energy needed to completely valley polarize the CFs around $\nu = 1/2$ is considerably smaller than the corresponding value for CFs around $\nu = 3/2$, thus betraying a breaking of particle-hole symmetry.
