Machine learning techniques allow a direct mapping of atomic positions and nuclear charges to the potential energy surface with almost ab initio accuracy and the computational efficiency of empirical potentials. In this work we propose a machine learning method for constructing high-dimensional potential energy surfaces based on feed-forward neural networks. As input to the neural network we propose an extendable, invariant local molecular descriptor constructed from geometric moments. Their formulation via pairwise distance vectors and tensor contractions allows a very efficient implementation on graphics processing units (GPUs). The atomic species is encoded in the molecular descriptor, so that a single neural network suffices for training all atomic species in the data set. We demonstrate that the accuracy of the developed approach in representing both chemical and configurational spaces is comparable to that of several established machine learning models. Owing to their high accuracy and efficiency, the proposed machine-learned potentials can be used for further tasks, for example the optimization of molecular geometries, the calculation of rate constants, or molecular dynamics.
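To illustrate the kind of descriptor described above, the following is a minimal NumPy sketch, not the paper's exact formulation: it forms weighted geometric moments from pairwise distance vectors and contracts them into rotation-invariant features. The function name, the nuclear-charge weighting, and the particular contractions are illustrative assumptions.

```python
import numpy as np

def moment_descriptor(positions, charges, i, cutoff=5.0):
    """Sketch of a rotation-invariant geometric-moment descriptor for atom i."""
    vecs, weights = [], []
    for j, r_j in enumerate(positions):
        if j == i:
            continue
        d = r_j - positions[i]
        dist = np.linalg.norm(d)
        if dist < cutoff:
            fc = 0.5 * (np.cos(np.pi * dist / cutoff) + 1.0)  # smooth cutoff
            vecs.append(d / dist)                # unit pairwise distance vector
            weights.append(charges[j] * fc)      # species encoded via charge
    vecs = np.asarray(vecs).reshape(-1, 3)
    weights = np.asarray(weights)
    m0 = weights.sum()                                  # order-0 moment (scalar)
    m1 = np.einsum('j,ja->a', weights, vecs)            # order-1 moment (vector)
    m2 = np.einsum('j,ja,jb->ab', weights, vecs, vecs)  # order-2 moment (tensor)
    # tensor contractions yield rotation-invariant features
    return np.array([m0, m1 @ m1, np.trace(m2), np.einsum('ab,ab->', m2, m2)])
```

Because the moments reduce to weighted sums and `einsum` contractions over all neighbours at once, the same computation maps naturally onto batched GPU tensor operations.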
We propose a simple but efficient and accurate machine learning (ML) model for developing high-dimensional potential energy surfaces. This so-called embedded atom neural network (EANN) approach is inspired by the well-known empirical embedded atom method (EAM) model used in the condensed phase. It simply replaces the scalar embedded atom density in EAM with a Gaussian-type orbital based density vector, and represents the complex relationship between the embedded density vector and the atomic energy by neural networks. We demonstrate that the EANN approach is as accurate as several established ML models in representing both large molecular and extended periodic systems, yet with far fewer parameters and configurations. It is highly efficient, as it implicitly contains three-body information without an explicit sum over the conventional, costly angular descriptors. With high accuracy and efficiency, EANN potentials can vastly accelerate molecular dynamics and spectroscopic simulations in complex systems at the ab initio level.
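A minimal sketch of such an embedded density vector is given below, assuming a single Gaussian orbital with hypothetical width alpha and center rs (the actual model uses a set of orbitals per element). The key point it demonstrates: squaring the neighbour sums over Cartesian orbital components folds in three-body angular information without any explicit double loop over neighbour pairs.

```python
import numpy as np
from itertools import product
from math import factorial

def eann_density(positions, i, alpha=1.0, rs=0.0, L_max=2, cutoff=6.0):
    """Sketch of a GTO-based embedded density vector for atom i."""
    r_i = positions[i]
    density = []
    for L in range(L_max + 1):
        rho_L = 0.0
        # all Cartesian components (lx, ly, lz) with lx + ly + lz == L
        for lx, ly, lz in product(range(L + 1), repeat=3):
            if lx + ly + lz != L:
                continue
            coef = factorial(L) / (factorial(lx) * factorial(ly) * factorial(lz))
            s = 0.0
            for j, r_j in enumerate(positions):
                if j == i:
                    continue
                d = r_j - r_i
                r = np.linalg.norm(d)
                if r < cutoff:
                    fc = 0.5 * (np.cos(np.pi * r / cutoff) + 1.0)
                    s += d[0]**lx * d[1]**ly * d[2]**lz * np.exp(-alpha * (r - rs)**2) * fc
            rho_L += coef * s * s  # square of the single neighbour sum
        density.append(rho_L)
    return np.array(density)  # input to the atomic neural network
```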
Machine learning models, trained on data from ab initio quantum simulations, are yielding molecular dynamics potentials with unprecedented accuracy. One limiting factor is the quantity of available training data, which can be expensive to obtain. A quantum simulation often provides all atomic forces in addition to the total energy of the system; these forces carry far more information than the energy alone. It may appear that training a model to this large quantity of force data would introduce significant computational costs, but in fact training to all available force data should be only a few times more expensive than training to energies alone. Here we present a new algorithm for efficient force training, and benchmark its accuracy by training to forces from real-world datasets for organic chemistry and bulk aluminum.
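The abstract does not spell out the algorithm, but the economics of force training can be illustrated with a toy model that is linear in its descriptors (all names below are hypothetical, and this is not the paper's method): each configuration contributes one energy equation and 3N force equations to a single least-squares problem, so the force data adds far more constraints than cost.

```python
import numpy as np

# Toy setting: E(x) = D(x) @ w, so forces are F(x) = -dD/dx @ w, and energies
# plus all 3N force components per configuration stack into one linear fit.
rng = np.random.default_rng(0)
n_conf, n_atoms, n_feat = 50, 8, 10

D = rng.normal(size=(n_conf, n_feat))                # descriptors per configuration
dD = rng.normal(size=(n_conf, 3 * n_atoms, n_feat))  # descriptor Jacobians
w_true = rng.normal(size=n_feat)

E = D @ w_true        # reference energies
F = -dD @ w_true      # reference forces, 3N components per configuration

# One energy row plus 3N force rows per configuration in a single design matrix.
A = np.vstack([D, -dD.reshape(-1, n_feat)])
b = np.concatenate([E, F.reshape(-1)])
w_fit, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(w_fit, w_true))  # True: the force rows pin down the model
```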
Machine learning of atomic-scale properties is revolutionizing molecular modelling, making it possible to evaluate interatomic potentials with first-principles accuracy at a fraction of the cost. The accuracy, speed, and reliability of machine-learning potentials, however, depend strongly on the way atomic configurations are represented, i.e. the choice of descriptors used as input to the machine learning method. The raw Cartesian coordinates are typically transformed into fingerprints, or symmetry functions, designed to encode, in addition to the structure, important properties of the potential-energy surface such as its invariance with respect to rotation, translation, and permutation of like atoms. Here we discuss automatic protocols for selecting a number of fingerprints out of a large pool of candidates, based on the correlations that are intrinsic to the training data. This procedure can greatly simplify the construction of neural network potentials that strike the best balance between accuracy and computational efficiency, and has the potential to accelerate by orders of magnitude the evaluation of Gaussian Approximation Potentials based on the Smooth Overlap of Atomic Positions kernel. We present applications to the construction of neural network potentials for water and for an Al-Mg-Si alloy, and to the prediction of the formation energies of small organic molecules using Gaussian process regression.
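One simple variant of such a data-driven selection is a CUR-style ranking of candidate fingerprint columns by their leverage scores on the training set. The sketch below is a rough stand-in for the protocols the paper discusses, not their exact procedure; the function name and arguments are illustrative.

```python
import numpy as np

def select_fingerprints(X, n_select):
    """Rank candidate fingerprints by CUR-style leverage scores.

    X: (n_structures, n_candidates) matrix of fingerprint values
    evaluated on the training data.
    """
    k = max(1, min(n_select, min(X.shape) - 1))
    _, _, Vt = np.linalg.svd(X - X.mean(axis=0), full_matrices=False)
    scores = (Vt[:k] ** 2).sum(axis=0)  # leverage score of each column
    return np.argsort(scores)[::-1][:n_select]

# usage: keep = select_fingerprints(candidate_matrix, 40)
```

Columns with high leverage dominate the top singular subspace of the training data, so keeping them preserves most of the information in the full candidate pool.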
In data processing and machine learning, an important challenge is to recover and exploit models that accurately represent the data. We consider the problem of recovering Gaussian mixture models from datasets, and investigate symmetric tensor decomposition methods for tackling it, where the tensor is built from empirical moments of the data distribution. We consider identifiable tensors, which admit a unique decomposition, and show that moment tensors built from spherical Gaussian mixtures have this property. We prove that symmetric tensors with interpolation degree strictly less than half their order are identifiable, and we present an algorithm, based on simple linear algebra operations, to compute their decomposition. Illustrative experiments show the impact of the tensor decomposition method for recovering Gaussian mixtures, in comparison with other state-of-the-art approaches.
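As a concrete example of the objects involved (a generic illustration, not the paper's algorithm), the empirical third-order moment tensor of a dataset is the symmetric tensor built below; for spherical Gaussian mixtures it equals a weighted sum of third tensor powers of the component means, up to a covariance-dependent correction, which is what a symmetric decomposition exploits to recover the mixture.

```python
import numpy as np

def third_moment_tensor(X):
    """Empirical third-order moment tensor M3[i,j,k] = mean_n x_i x_j x_k."""
    return np.einsum('ni,nj,nk->ijk', X, X, X) / X.shape[0]

# For a mixture of spherical Gaussians, E[x ⊗ x ⊗ x] is a weighted sum of
# tensor cubes mu_t ⊗ mu_t ⊗ mu_t of the component means plus a term that
# depends only on the covariance scale; decomposing the corrected tensor
# recovers the means.
X = np.random.default_rng(1).normal(size=(1000, 5))
print(third_moment_tensor(X).shape)  # (5, 5, 5), symmetric in all indices
```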
Quantum simulators and processors are improving rapidly, but they still cannot solve complex, multidimensional problems of practical value. However, certain numerical algorithms inspired by the physics of real quantum devices prove efficient for specific problems related, for example, to combinatorial optimization. Here we implement a numerical annealer, based on simulating the coherent Ising machine, as a tool for sampling from a high-dimensional Boltzmann probability distribution whose energy functional is defined by the classical Ising Hamiltonian. Samples provided by such a generator are then utilized for estimating the partition function of this distribution and for training a general Boltzmann machine. Our study opens the door to practical applications of numerical quantum-inspired annealers.
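For concreteness, the target distribution can be sketched as follows. The Metropolis loop below is a generic classical stand-in for the simulated coherent Ising machine, not the authors' sampler; function names and parameters are illustrative.

```python
import numpy as np

def ising_energy(s, J, h):
    """Classical Ising energy E(s) = -1/2 s^T J s - h^T s, spins s in {-1,+1}^n."""
    return -0.5 * s @ J @ s - h @ s

def boltzmann_samples(J, h, beta, n_samples, n_sweeps=500, rng=None):
    """Metropolis sampler targeting p(s) ~ exp(-beta * E(s)).

    Assumes J is symmetric with zero diagonal.
    """
    rng = rng or np.random.default_rng()
    n = len(h)
    samples = []
    for _ in range(n_samples):
        s = rng.choice([-1, 1], size=n)
        for _ in range(n_sweeps):
            i = rng.integers(n)
            dE = 2 * s[i] * (J[i] @ s + h[i])  # energy change from flipping spin i
            if dE < 0 or rng.random() < np.exp(-beta * dE):
                s[i] = -s[i]
        samples.append(s.copy())
    return np.array(samples)
```

Samples drawn this way can be plugged into standard estimators of the partition function and into the gradient of a Boltzmann machine's log-likelihood, which is the role the quantum-inspired annealer plays in the study above.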