No Arabic abstract
This work presents a novel general compact model for 7nm technology node devices like FinFETs. As an extension of previous conventional compact model that based on some less accurate elements including one-dimensional Poisson equation for three-dimensional devices and analytical equations for short channel effects, quantum effects and other physical effects, the general compact model combining few TCAD calibrated compact models with statistical methods can eliminate the tedious physical derivations. The general compact model has the advantages of efficient extraction, high accuracy, strong scaling capability and excellent transfer capability. As a demo application, two key design knobs of FinFET and their multiple impacts on RC control ESD power clamp circuit are systematically evaluated with implementation of the newly proposed general compact model, accounting for device design, circuit performance optimization and variation control. The performance of ESD power clamp can be improved extremely. This framework is also suitable for pathfinding researches on 5nm node gate-all-around devices, like nanowire (NW) FETs, nanosheet (NSH) FETs and beyond.
A Boltzmann machine whose effective temperature can be dynamically cooled provides a stochastic neural network realization of simulated annealing, which is an important metaheuristic for solving combinatorial or global optimization problems with broad applications in machine intelligence and operations research. However, the hardware realization of the Boltzmann stochastic element with cooling capability has never been achieved within an individual semiconductor device. Here we demonstrate a new memristive device concept based on two-dimensional material heterostructures that enables this critical stochastic element in a Boltzmann machine. The dynamic cooling effect in simulated annealing can be emulated in this multi-terminal memristive device through electrostatic bias with sigmoidal thresholding distributions. We also show that a machine-learning-based method is efficient for device-circuit co-design of the Boltzmann machine based on the stochastic memristor devices in simulated annealing. The experimental demonstrations of the tunable stochastic memristors combined with the machine-learning-based device-circuit co-optimization approach for stochastic-memristor-based neural-network circuits chart a pathway for the efficient hardware realization of stochastic neural networks with applications in a broad range of electronics and computing disciplines.
This paper presents a physics-based modeling framework for the analysis and transient simulation of circuits containing Spin-Transfer Torque (STT) Magnetic Tunnel Junction (MTJ) devices. The framework provides the tools to analyze the stochastic behavior of MTJs and to generate Verilog-A compact models for their simulation in large VLSI designs, addressing the need for an industry-ready model accounting for real-world reliability and scalability requirements. Device dynamics are described by the Landau-Lifshitz-Gilbert-Slonczewsky (s-LLGS ) stochastic magnetization considering Voltage-Controlled Magnetic Anisotropy (VCMA) and the non-negligible statistical effects caused by thermal noise. Model behavior is validated against the OOMMF magnetic simulator and its performance is characterized on a 1-Mb 28 nm Magnetoresistive-RAM (MRAM) memory product.
In this study, a SPICE model for negative capacitance vertical nanowire field-effect-transistor (NC VNW-FET) based on BSIM-CMG model and Landau-Khalatnikov (LK) equation was presented. Suffering from the limitation of short gate length there is lack of controllable and integrative structures for high performance NC VNW-FETs. A new kind of structure was proposed for NC VNW-FETs at sub-3nm node. Moreover, in order to understand and improve NC VNW-FETs, the S-shaped polarization-voltage curve (S-curve) was divided into four regions and some new design rules were proposed. By using the SPICE model, device-circuit co-optimization was implemented. The co-design of gate work function (WF) and NC was investigated. A ring oscillator was simulated to analyze the circuit energy-delay, and it shown that significant energy reduction, up to 88%, at iso-delay for NC VNW-FETs at low supply voltage can be achieved. This study gives a credible method to analysis the performance of NC based devices and circuits and reveals the potential of NC VNW-FETs in low-power applications.
We propose a technology-independent method, referred to as adjacent connection matrix (ACM), to efficiently map signed weight matrices to non-negative crossbar arrays. When compared to same-hardware-overhead mapping methods, using ACM leads to improvements of up to 20% in training accuracy for ResNet-20 with the CIFAR-10 dataset when training with 5-bit precision crossbar arrays or lower. When compared with strategies that use two elements to represent a weight, ACM achieves comparable training accuracies, while also offering area and read energy reductions of 2.3x and 7x, respectively. ACM also has a mild regularization effect that improves inference accuracy in crossbar arrays without any retraining or costly device/variation-aware training.
Bi-Level Optimization (BLO) is originated from the area of economic game theory and then introduced into the optimization community. BLO is able to handle problems with a hierarchical structure, involving two levels of optimization tasks, where one task is nested inside the other. In machine learning and computer vision fields, despite the different motivations and mechanisms, a lot of complex problems, such as hyper-parameter optimization, multi-task and meta-learning, neural architecture search, adversarial learning and deep reinforcement learning, actually all contain a series of closely related subproblms. In this paper, we first uniformly express these complex learning and vision problems from the perspective of BLO. Then we construct a best-response-based single-level reformulation and establish a unified algorithmic framework to understand and formulate mainstream gradient-based BLO methodologies, covering aspects ranging from fundamental automatic differentiation schemes to various accelerations, simplifications, extensions and their convergence and complexity properties. Last but not least, we discuss the potentials of our unified BLO framework for designing new algorithms and point out some promising directions for future research.