Machine Learning techniques can be used to represent high-dimensional potential energy surfaces for reactive chemical systems. Two such methods are based on reproducing kernel Hilbert space representations and on deep neural networks. Both can achieve sub-1 kcal/mol accuracy with respect to reference data and can be used in studies of chemical dynamics. Their construction and a few typical examples are briefly summarized in the present contribution.
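To make the reproducing-kernel idea concrete, the sketch below fits a one-dimensional model potential by kernel ridge regression, a standard RKHS method. The Morse parameters, Gaussian kernel, and regularization value are arbitrary illustrative choices, not those of any surface discussed in the abstract.

```python
import numpy as np

# Toy Morse potential standing in for ab initio reference energies
# (D, a, r0 are invented illustrative parameters, in arbitrary units).
def morse(r, D=4.7, a=1.9, r0=0.74):
    return D * (1.0 - np.exp(-a * (r - r0)))**2

# Gaussian (RBF) kernel; its RKHS hosts the fitted surface.
def kernel(x, y, sigma=0.3):
    return np.exp(-(x[:, None] - y[None, :])**2 / (2 * sigma**2))

r_train = np.linspace(0.5, 2.5, 20)          # sparse reference geometries
E_train = morse(r_train)                     # reference energies

lam = 1e-8                                   # small ridge regularization
K = kernel(r_train, r_train)
alpha = np.linalg.solve(K + lam * np.eye(len(r_train)), E_train)

r_test = np.linspace(0.6, 2.4, 200)
E_pred = kernel(r_test, r_train) @ alpha     # interpolated PES

err = np.max(np.abs(E_pred - morse(r_test)))
print(f"max abs interpolation error on test grid: {err:.2e}")
```

With only 20 smooth reference points, the kernel interpolant reproduces the model curve closely across the sampled range; real RKHS-based PES constructions apply the same machinery in many dimensions with physically motivated kernels.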
Dynamics of flexible molecules are often determined by an interplay between local chemical bond fluctuations and conformational changes driven by long-range electrostatics and van der Waals interactions. This interplay between interactions yields complex potential-energy surfaces (PES) with multiple minima and transition paths between them. In this work, we assess the performance of state-of-the-art Machine Learning (ML) models, namely sGDML, SchNet, GAP/SOAP, and BPNN, for reproducing such PES while using limited amounts of reference data. As a benchmark, we use the cis-to-trans thermal relaxation in an azobenzene molecule, where at least three different transition mechanisms should be considered. Although GAP/SOAP, SchNet, and sGDML models can globally achieve chemical accuracy of 1 kcal/mol with fewer than 1000 training points, predictions greatly depend on the ML method used as well as the local region of the PES being sampled. Within a given ML method, large differences can be found between predictions for close-to-equilibrium and transition regions, as well as for different transition mechanisms. We identify key challenges that the ML models face in learning long-range interactions and the intrinsic limitations of commonly used atom-based descriptors. All in all, our results suggest switching from learning the entire PES within a single model to using multiple local models with optimized descriptors, training sets, and architectures for different parts of complex PES.
The concept of machine learning configuration interaction (MLCI) [J. Chem. Theory Comput. 2018, 14, 5739], where an artificial neural network (ANN) learns on the fly to select important configurations, is further developed so that accurate ab initio potential energy curves can be efficiently calculated. This development includes also employing the artificial neural network as a hash function for the efficient deletion of duplicates on the fly, so that the singles and doubles space does not need to be stored and this barrier to scalability is removed. In addition, configuration state functions are introduced into the approach so that pure spin states are guaranteed, and the transferability of data between geometries is exploited. This improved approach is demonstrated on potential energy curves for the nitrogen molecule, water, and carbon monoxide. The results are compared with full configuration interaction values, when available, and different transfer protocols are investigated. It is shown that, for all of the considered systems, accurate potential energy curves can now be efficiently computed with MLCI. For the potential curves of N$_{2}$ and CO, MLCI can achieve lower errors than stochastically selecting configurations while also using substantially fewer processor hours.
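The on-the-fly duplicate deletion can be illustrated with a toy enumeration of excited determinants: each determinant is a sorted tuple of occupied spin-orbital indices, and a hash lookup discards repeats as they stream past, so the full singles-and-doubles list is never stored. A plain Python set stands in here for the ANN-based hash function described in the abstract; the orbital counts and reference occupation are invented for illustration.

```python
# Toy model: determinants as sorted tuples of occupied orbital indices.
def singles(occ, n_orb):
    """Generate all single excitations from an occupancy tuple."""
    occ_set = set(occ)
    for i in occ:                         # excite electron out of orbital i
        for a in range(n_orb):            # into any unoccupied orbital a
            if a not in occ_set:
                yield tuple(sorted((occ_set - {i}) | {a}))

n_orb, reference = 6, (0, 1, 2)           # 3 electrons in 6 orbitals (toy)

seen = set()                              # hash set replaces storing the list
total = 0
for det1 in singles(reference, n_orb):
    for det2 in singles(det1, n_orb):     # two singles reach the doubles space
        total += 1
        seen.add(det2)                    # duplicates vanish on insertion

print(f"{len(seen)} unique determinants out of {total} generated")
```

Many excitation paths generate the same determinant, so deduplicating as candidates arrive keeps memory bounded by the number of unique configurations rather than the number of generation events.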
We propose a machine-learning method for evaluating the potential barrier governing atomic transport, based on the preferential selection of points that dominate the atomic transport. The proposed method generates numerous random samples of the entire potential energy surface (PES) from a probabilistic Gaussian process model of the PES, which enables defining the likelihood that a given point is dominant. The robustness and efficiency of the method are demonstrated on a dozen model cases of proton diffusion in oxides, in comparison with the conventional nudged elastic band method.
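The core idea can be sketched in one dimension: fit a Gaussian process to a few energies along a reaction path, draw random surfaces from the posterior, and count how often each grid point is the path maximum, i.e., the barrier top. The kernel length scale, observation points, and toy energy profile below are all invented for illustration and are not taken from the paper.

```python
import numpy as np

# Squared-exponential kernel (length scale is an arbitrary choice).
def rbf(x, y, ell=0.15):
    return np.exp(-(x[:, None] - y[None, :])**2 / (2 * ell**2))

def toy_energy(s):                        # invented energy profile along path
    return np.sin(3 * np.pi * s) * (1 - s)

rng = np.random.default_rng(1)
s_obs = np.array([0.0, 0.2, 0.35, 0.5, 0.7, 0.9, 1.0])
E_obs = toy_energy(s_obs)

# Standard GP posterior mean and covariance on a dense grid.
s_grid = np.linspace(0.0, 1.0, 101)
K = rbf(s_obs, s_obs) + 1e-8 * np.eye(len(s_obs))
K_s = rbf(s_grid, s_obs)
mean = K_s @ np.linalg.solve(K, E_obs)
cov = rbf(s_grid, s_grid) - K_s @ np.linalg.solve(K, K_s.T)

# Sample many random surfaces; the likelihood that a grid point is the
# barrier top is the fraction of samples whose maximum falls there.
samples = rng.multivariate_normal(mean, cov + 1e-8 * np.eye(len(s_grid)), 2000)
counts = np.bincount(samples.argmax(axis=1), minlength=len(s_grid))
p_barrier = counts / samples.shape[0]
print("most likely barrier position s =", s_grid[p_barrier.argmax()])
```

Points with a high likelihood of being the barrier top are exactly the ones worth evaluating with the expensive reference method next, which is what makes the preferential-selection scheme cheaper than tracing the whole path as a nudged-elastic-band calculation does.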
An overview of computational methods to describe high-dimensional potential energy surfaces suitable for atomistic simulations is given. Particular emphasis is put on accuracy, computability, transferability and extensibility of the methods discussed. They include empirical force fields, representations based on reproducing kernels, using permutationally invariant polynomials, and neural network-learned representations and combinations thereof. Future directions and potential improvements are discussed primarily from a practical, application-oriented perspective.
We refine the OrbNet model to accurately predict energy, forces, and other response properties for molecules using a graph neural-network architecture based on features from low-cost approximated quantum operators in the symmetry-adapted atomic orbital basis. The model is end-to-end differentiable due to the derivation of analytic gradients for all electronic structure terms, and is shown to be transferable across chemical space due to the use of domain-specific features. The learning efficiency is improved by incorporating physically motivated constraints on the electronic structure through multi-task learning. The model outperforms existing methods on energy prediction tasks for the QM9 dataset and for molecular geometry optimizations on conformer datasets, at a computational cost that is reduced a thousand-fold or more compared to conventional quantum-chemistry calculations (such as density functional theory) that offer similar accuracy.