The Fermionic Neural Network (FermiNet) is a recently developed neural network architecture that can be used as a wavefunction Ansatz for many-electron systems, and has already demonstrated high accuracy on small systems. Here we present several improvements to the FermiNet that allow us to set new records for speed and accuracy on challenging systems. We find that increasing the size of the network is sufficient to reach chemical accuracy on atoms as large as argon. By implementing the FermiNet in JAX and simplifying several parts of the network, we reduce the number of GPU hours needed to train the FermiNet on large systems by an order of magnitude. This enables us to run the FermiNet on the challenging transition of bicyclobutane to butadiene and to compare against the PauliNet on the automerization of cyclobutadiene, and we achieve results near the state of the art for both.
Accurate numerical solutions for the Schrödinger equation are of utmost importance in quantum chemistry. However, the computational cost of current high-accuracy methods scales poorly with the number of interacting particles. Combining Monte Carlo methods with unsupervised training of neural networks has recently been proposed as a promising approach to overcome the curse of dimensionality in this setting and to obtain accurate wavefunctions for individual molecules at a moderately scaling computational cost. These methods currently do not exploit the regularity exhibited by wavefunctions with respect to their molecular geometries. Inspired by recent successful applications of deep transfer learning in machine translation and computer vision tasks, we attempt to leverage this regularity by introducing a weight-sharing constraint when optimizing neural network-based models for different molecular geometries. That is, we restrict the optimization process such that up to 95 percent of weights in a neural network model are in fact equal across varying molecular geometries. We find that this technique can accelerate optimization when considering sets of nuclear geometries of the same molecule by an order of magnitude and that it opens a promising route towards pre-trained neural network wavefunctions that yield high accuracy even across different molecules.
Coarse graining enables the investigation of molecular dynamics for larger systems and at longer timescales than is possible at atomic resolution. However, a coarse-graining model must be formulated such that the conclusions we draw from it are consistent with the conclusions we would draw from a model at a finer level of detail. It has been proven that a force matching scheme defines a thermodynamically consistent coarse-grained model for an atomistic system in the variational limit. Wang et al. [ACS Cent. Sci. 5, 755 (2019)] demonstrated that the existence of such a variational limit enables the use of a supervised machine learning framework to generate a coarse-grained force field, which can then be used for simulation in the coarse-grained space. Their framework, however, requires the manual input of molecular features upon which to machine learn the force field. In the present contribution, we build upon the advance of Wang et al. and introduce a hybrid architecture for the machine learning of coarse-grained force fields that learns its own features via a subnetwork that leverages continuous filter convolutions on a graph neural network architecture. We demonstrate that this framework succeeds at reproducing the thermodynamics for small biomolecular systems. Since the learned molecular representations are inherently transferable, the architecture presented here sets the stage for the development of machine-learned, coarse-grained force fields that are transferable across molecular systems.
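The force matching idea mentioned above can be illustrated with a minimal sketch. It assumes a one-dimensional toy system whose coarse-grained force model is a single harmonic spring, F(x) = -k x; the spring model, the sample data, and the function names are hypothetical choices for illustration and are not the graph neural network architecture described in the abstract.

```python
def force_matching_loss(k, positions, reference_forces):
    """Mean squared difference between model forces -k*x and reference forces."""
    total = sum((f + k * x) ** 2 for x, f in zip(positions, reference_forces))
    return total / len(positions)


def fit_spring_constant(positions, reference_forces):
    """Closed-form least-squares minimizer of the loss above for F(x) = -k*x."""
    numerator = sum(x * f for x, f in zip(positions, reference_forces))
    denominator = sum(x * x for x in positions)
    return -numerator / denominator


# Hypothetical reference data: forces from a spring with k = 2 plus small noise.
positions = [-1.0, -0.5, 0.5, 1.0]
noise = [0.1, -0.1, -0.1, 0.1]
reference_forces = [-2.0 * x + n for x, n in zip(positions, noise)]

k_fit = fit_spring_constant(positions, reference_forces)
print(k_fit, force_matching_loss(k_fit, positions, reference_forces))
```

In a learned force field the single parameter k would be replaced by network weights and the closed-form fit by gradient-based training, but the loss keeps the same supervised form.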
The spiking neural network (SNN) computes and communicates information through discrete binary events. It is considered more biologically plausible and, on emerging neuromorphic hardware, more energy-efficient than artificial neural networks (ANNs). However, because spikes are discontinuous and non-differentiable, training an SNN is relatively challenging. Recent work has made substantial progress toward high performance by converting ANNs to SNNs. Because the two kinds of network process information differently, however, the converted deep SNN usually suffers serious performance loss and a large time delay. In this paper, we analyze the reasons for the performance loss and propose a novel bistable spiking neural network (BSNN) that addresses the problem of spikes of inactivated neurons (SIN) caused by phase lead and phase lag. In addition, when ANNs based on the ResNet structure are converted, the information of output neurons is incomplete due to the rapid transmission of the shortcut path. We design synchronous neurons (SN) to help improve performance efficiently. Experimental results show that the proposed method needs only 1/4 to 1/10 of the time steps required by previous work to achieve nearly lossless conversion. We demonstrate state-of-the-art ANN-SNN conversion for VGG16, ResNet20, and ResNet34 on challenging datasets including CIFAR-10 (95.16% top-1), CIFAR-100 (78.12% top-1), and ImageNet (72.64% top-1).
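As background for the conversion setting, the sketch below shows the generic rate-coding principle that ANN-to-SNN conversion commonly relies on: an integrate-and-fire neuron driven by a constant input fires at a rate that approximates a ReLU activation. It is not the bistable neuron (BSNN) or the synchronous neuron proposed in the abstract; the soft-reset rule, the unit threshold, and the function name are assumptions made for illustration.

```python
def firing_rate(input_current, num_steps, threshold=1.0):
    """Fraction of time steps on which an integrate-and-fire neuron spikes."""
    membrane = 0.0
    spikes = 0
    for _ in range(num_steps):
        membrane += input_current      # integrate the constant input
        if membrane >= threshold:
            spikes += 1
            membrane -= threshold      # soft reset keeps the residual charge
    return spikes / num_steps


# With many time steps the rate tracks max(0, input) for inputs in [0, 1];
# with few time steps the quantization error is visibly larger.
for steps in (4, 100):
    print(steps, firing_rate(0.3, steps), firing_rate(-0.5, steps))
```

The dependence of the approximation error on the number of time steps is one reason why reducing the required time steps matters for converted networks.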
Eigenvalue problems are critical to several fields of science and engineering. We present a novel unsupervised neural network for discovering eigenfunctions and eigenvalues for differential eigenvalue problems with solutions that identically satisfy the boundary conditions. A scanning mechanism is embedded allowing the method to find an arbitrary number of solutions. The network optimization is data-free and depends solely on the predictions. The unsupervised method is used to solve the quantum infinite well and quantum oscillator eigenvalue problems.
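A minimal sketch of the two ingredients named above, under stated assumptions: the infinite well is taken on the interval [0, 1] in units where the eigenvalue equation reads f'' + E f = 0, the boundary conditions are enforced by multiplying an arbitrary function by x(1 - x), and derivatives are taken by finite differences. The multiplier, the finite-difference scheme, and all names are hypothetical choices for illustration and may differ from the parametrization used in the paper.

```python
import math


def trial_solution(x, net, left=0.0, right=1.0):
    """Wrap any function so that the result vanishes at both walls."""
    return (x - left) * (right - x) * net(x)


def residual_loss(f, energy, num_points=50, h=1e-3):
    """Mean squared residual of f'' + energy * f = 0 at interior points of (0, 1)."""
    total = 0.0
    for i in range(1, num_points + 1):
        x = i / (num_points + 1)
        second = (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)
        total += (second + energy * f(x)) ** 2
    return total / num_points


def toy_net(x):
    """Hypothetical stand-in for a neural network output."""
    return 1.0 + 0.5 * x


def ground_state(x):
    """Known lowest eigenfunction of the well on [0, 1] in these units."""
    return math.sin(math.pi * x)


# The wrapped function satisfies the boundary conditions for any inner function.
print(trial_solution(0.0, toy_net), trial_solution(1.0, toy_net))

# The loss needs no data: it is small only when (f, E) is an eigenpair.
loss_exact = residual_loss(ground_state, math.pi ** 2)
loss_wrong = residual_loss(ground_state, 5.0)
print(loss_exact, loss_wrong)
```

In an actual training loop the inner function would be a network with trainable weights and the eigenvalue would be a trainable or scanned quantity; both would be adjusted to minimize the residual loss.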
There has been a wave of interest in applying machine learning to study dynamical systems. In particular, neural networks have been applied to solve the equations of motion and, therefore, to track the evolution of a system. Unlike other application domains of neural networks and machine learning, dynamical systems possess invariants such as energy, momentum, and angular momentum, depending on their underlying symmetries. Traditional numerical integration methods sometimes violate these conservation laws, propagating errors in time and ultimately reducing the predictability of the method. We present a data-free Hamiltonian neural network that solves the differential equations that govern dynamical systems. This is an equation-driven unsupervised learning method in which the optimization process of the network depends solely on the predicted functions, without using any ground truth data. This unsupervised model learns solutions that satisfy Hamilton's equations identically, up to an arbitrarily small error, and therefore conserve the Hamiltonian invariants. Once the network is optimized, the proposed architecture is considered a symplectic unit due to the introduction of an efficient parametric form of solutions. In addition, the choice of an appropriate activation function drastically improves the predictability of the network. An error analysis is derived, showing that the numerical errors depend on the overall network performance. The symplectic architecture is then employed to solve the equations for the nonlinear oscillator and the chaotic Hénon-Heiles dynamical system. In both systems, a symplectic Euler integrator requires two orders of magnitude more evaluation points than the Hamiltonian network to achieve the same order of numerical error in the predicted phase space trajectories.
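For reference, the symplectic Euler baseline mentioned above can be sketched as follows. The Hamiltonian H = p^2/2 + x^2/2 + x^4/4, the initial condition, and the step size are assumptions chosen for illustration; the abstract does not state them. A non-symplectic explicit Euler integrator is included only to show the energy drift that symplectic schemes avoid.

```python
def energy(x, p):
    """Assumed nonlinear-oscillator Hamiltonian H = p^2/2 + x^2/2 + x^4/4."""
    return 0.5 * p * p + 0.5 * x * x + 0.25 * x ** 4


def symplectic_euler(x, p, dt, num_steps):
    """Update p with the old x, then update x with the new p."""
    for _ in range(num_steps):
        p -= dt * (x + x ** 3)   # dp/dt = -dH/dx
        x += dt * p              # dx/dt = dH/dp, evaluated at the updated p
    return x, p


def explicit_euler(x, p, dt, num_steps):
    """Update x and p simultaneously from the old state (not symplectic)."""
    for _ in range(num_steps):
        x, p = x + dt * p, p - dt * (x + x ** 3)
    return x, p


x0, p0, dt, steps = 1.0, 0.0, 0.01, 2000
e0 = energy(x0, p0)
drift_symplectic = abs(energy(*symplectic_euler(x0, p0, dt, steps)) - e0)
drift_explicit = abs(energy(*explicit_euler(x0, p0, dt, steps)) - e0)
print(drift_symplectic, drift_explicit)
```

The symplectic scheme keeps the energy error bounded at a level set by the step size, whereas the explicit scheme accumulates drift over time; reducing the error of either integrator requires more evaluation points.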