
Atom-Density Representations for Machine Learning

Published by Michele Ceriotti
Publication date: 2018
Research field: Physics
Language: English





The applications of machine learning techniques to chemistry and materials science become more numerous by the day. The main challenge is to devise representations of atomic systems that are at the same time complete and concise, so as to reduce the number of reference calculations that are needed to predict the properties of different types of materials reliably. This has led to a proliferation of alternative ways to convert an atomic structure into an input for a machine-learning model. We introduce an abstract definition of chemical environments that is based on a smoothed atomic density, using a bra-ket notation to emphasize basis set independence and to highlight the connections with some popular choices of representations for describing atomic systems. The correlations between the spatial distribution of atoms and their chemical identities are computed as inner products between these feature kets. These inner products can be given an explicit representation in terms of the expansion of the atom density on orthogonal basis functions, which is equivalent to the smooth overlap of atomic positions (SOAP) power spectrum, and also in real space, where they correspond to $n$-body correlations of the atom density. This formalism lays the foundations for a more systematic tuning of the behavior of the representations, by introducing operators that represent the correlations between structure, composition, and the target properties. It provides a unifying picture of recent developments in the field and indicates a way forward towards more effective and computationally affordable machine-learning schemes for molecules and materials.
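As a rough illustration of the idea of a smoothed atom density and its lowest-order correlation, the sketch below places a Gaussian on each neighbor of a central atom and builds a two-body (radial) feature vector from the smeared neighbor distances. It is a simplified stand-in, not the SOAP power spectrum or the bra-ket formalism of the paper. The function names, the Gaussian width `sigma`, and the radial grid are all assumptions made for this example, and chemical identities are ignored.

```python
import math

def smoothed_density(point, neighbors, sigma=0.5):
    """Gaussian-smoothed atom density at `point`.

    `neighbors` holds positions relative to the central atom (hypothetical input).
    """
    return sum(
        math.exp(-sum((p - q) ** 2 for p, q in zip(point, pos)) / (2 * sigma**2))
        for pos in neighbors
    )

def radial_features(neighbors, grid, sigma=0.5):
    """Two-body features: Gaussian-smeared histogram of neighbor distances."""
    dists = [math.sqrt(sum(c * c for c in pos)) for pos in neighbors]
    return [
        sum(math.exp(-(r - d) ** 2 / (2 * sigma**2)) for d in dists)
        for r in grid
    ]

def rotate_z(pos, angle):
    """Rotate a 3D position about the z axis."""
    x, y, z = pos
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)

# Toy environment: three neighbors around a central atom at the origin.
neighbors = [(1.0, 0.0, 0.0), (0.0, 1.5, 0.0), (0.0, 0.0, 2.0)]
grid = [0.5 * k for k in range(1, 7)]

features = radial_features(neighbors, grid)
rotated = radial_features([rotate_z(p, 0.7) for p in neighbors], grid)
```

Because the features depend only on neighbor distances, `features` and `rotated` agree up to floating-point error, which is the rotational invariance that such representations are designed to provide. Higher-order correlations would additionally encode angles between neighbors.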




Read also

Physically-motivated and mathematically robust atom-centred representations of molecular structures are key to the success of modern atomistic machine learning (ML) methods. They lie at the foundation of a wide range of methods to predict the properties of both materials and molecules as well as to explore and visualize the chemical compound and configuration space. Recently, it has become clear that many of the most effective representations share a fundamental formal connection: they can all be expressed as a discretization of N-body correlation functions of the local atom density, suggesting the opportunity of standardizing and, more importantly, optimizing the calculation of such representations. We present an implementation, named librascal, whose modular design lends itself both to developing refinements to the density-based formalism and to rapid prototyping for new developments of rotationally equivariant atomistic representations. As an example, we discuss SOAP features, perhaps the most widely used member of this family of representations, to show how the expansion of the local density can be optimized for any choice of radial basis set. We discuss the representation in the context of a kernel ridge regression model, commonly used with SOAP features, and analyze how the computational effort scales for each of the individual steps of the calculation. By applying data reduction techniques in feature space, we show how to further reduce the total computational cost by up to a factor of 4 or 5 without affecting the model's symmetry properties and without significantly impacting its accuracy.
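The kernel ridge regression step mentioned above can be sketched in a few lines. This is a generic, pure-Python illustration and does not use the librascal API. The feature vectors are toy stand-ins for SOAP features, and the kernel (a normalized dot product raised to a power `zeta`), the regularization `reg`, and all function names are assumptions chosen for the example.

```python
import math

def normalize(v):
    """Scale a feature vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def kernel(a, b, zeta=2):
    """Dot-product kernel raised to the power zeta (features assumed normalized)."""
    return sum(x * y for x, y in zip(a, b)) ** zeta

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def fit(features, targets, reg=1e-8):
    """Return regression weights w solving (K + reg * I) w = y."""
    K = [[kernel(a, b) for b in features] for a in features]
    for i in range(len(K)):
        K[i][i] += reg
    return solve(K, targets)

def predict(x, features, weights):
    """Predict a property as a weighted sum of kernels to the training set."""
    return sum(w * kernel(x, f) for w, f in zip(weights, features))

# Toy data: three hypothetical environments and their target values.
train_x = [normalize(v) for v in [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]]
train_y = [0.0, 1.0, 2.0]
weights = fit(train_x, train_y)
```

Training costs one kernel matrix plus a linear solve, while each prediction costs one kernel evaluation per training point. That per-prediction sum is one place where shortening the feature vectors, as in the data reduction mentioned above, would lower the cost.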
Yaolong Zhang, Ce Hu, Bin Jiang, 2019
We propose a simple, but efficient and accurate machine learning (ML) model for developing high-dimensional potential energy surfaces. This so-called embedded atom neural network (EANN) approach is inspired by the well-known empirical embedded atom method (EAM) model used for condensed phases. It simply replaces the scalar embedded atom density in EAM with a Gaussian-type orbital based density vector, and represents the complex relationship between the embedded density vector and atomic energy by neural networks. We demonstrate that the EANN approach is as accurate as several established ML models in representing both big molecular and extended periodic systems, yet with far fewer parameters and configurations. It is highly efficient as it implicitly contains the three-body information without an explicit sum of the conventional costly angular descriptors. With high accuracy and efficiency, EANN potentials can vastly accelerate molecular dynamics and spectroscopic simulations in complex systems at the ab initio level.
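To see how a squared sum of Gaussian-type orbitals can carry three-body information without an explicit loop over angles, consider the minimal sketch below. The functional form is an assumption made for illustration: orbitals of the form x^lx y^ly z^lz exp(-alpha (r - r_s)^2), summed over neighbors, squared, and combined with multinomial weights for a fixed total angular momentum L. It omits any cutoff function and the neural network, and the names `eann_like_density`, `alpha`, `r_s`, and `coeffs` are hypothetical.

```python
import math
from itertools import product

def eann_like_density(neighbors, L, alpha=1.0, r_s=0.0, coeffs=None):
    """One component of a Gaussian-type-orbital density vector (illustrative form)."""
    coeffs = coeffs or [1.0] * len(neighbors)
    total = 0.0
    for lx, ly, lz in product(range(L + 1), repeat=3):
        if lx + ly + lz != L:
            continue
        # Multinomial weight L! / (lx! ly! lz!).
        weight = math.factorial(L) / (
            math.factorial(lx) * math.factorial(ly) * math.factorial(lz)
        )
        orbital_sum = 0.0
        for c, (x, y, z) in zip(coeffs, neighbors):
            r = math.sqrt(x * x + y * y + z * z)
            radial = math.exp(-alpha * (r - r_s) ** 2)
            orbital_sum += c * x**lx * y**ly * z**lz * radial
        # Only a single sum over neighbors is squared; no pair loop is needed.
        total += weight * orbital_sum**2
    return total

def pair_sum(neighbors, L, alpha=1.0, r_s=0.0):
    """Equivalent explicit double sum over neighbor pairs, for comparison."""
    g = [math.exp(-alpha * (math.sqrt(sum(c * c for c in p)) - r_s) ** 2)
         for p in neighbors]
    return sum(
        g[j] * g[k] * sum(a * b for a, b in zip(neighbors[j], neighbors[k])) ** L
        for j in range(len(neighbors))
        for k in range(len(neighbors))
    )

neighbors = [(1.0, 0.2, 0.0), (-0.3, 1.1, 0.4), (0.5, -0.6, 0.9)]
density_l2 = eann_like_density(neighbors, L=2)
reference_l2 = pair_sum(neighbors, L=2)
```

By the multinomial theorem the two results coincide: the squared single sum equals a double sum over pairs of neighbors weighted by the L-th power of their dot product, which depends on the angle between them. The single-sum form scales linearly with the number of neighbors, whereas the explicit pair sum scales quadratically.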
J. P. Coe, 2019
The concept of machine learning configuration interaction (MLCI) [J. Chem. Theory Comput. 2018, 14, 5739], where an artificial neural network (ANN) learns on the fly to select important configurations, is further developed so that accurate ab initio potential energy curves can be efficiently calculated. This development includes employing the artificial neural network also as a hash function for the efficient deletion of duplicates on the fly, so that the singles and doubles space does not need to be stored and this barrier to scalability is removed. In addition, configuration state functions are introduced into the approach so that pure spin states are guaranteed, and the transferability of data between geometries is exploited. This improved approach is demonstrated on potential energy curves for the nitrogen molecule, water, and carbon monoxide. The results are compared with full configuration interaction values, when available, and different transfer protocols are investigated. It is shown that, for all of the considered systems, accurate potential energy curves can now be efficiently computed with MLCI. For the potential curves of N$_{2}$ and CO, MLCI can achieve lower errors than stochastically selecting configurations while also using substantially fewer processor hours.
Statistical learning algorithms are finding more and more applications in science and technology. Atomic-scale modeling is no exception, with machine learning becoming commonplace as a tool to predict energy, forces and properties of molecules and condensed-phase systems. This short review summarizes recent progress in the field, focusing in particular on the problem of representing an atomic configuration in a mathematically robust and computationally efficient way. We also discuss some of the regression algorithms that have been used to construct surrogate models of atomic-scale properties. We then show examples of how the optimization of the machine-learning models can both incorporate and reveal insights into the physical phenomena that underlie structure-property relations.
Two types of approaches to modeling molecular systems have demonstrated high practical efficiency. Density functional theory (DFT), the most widely used quantum chemical method, is a physical approach predicting energies and electron densities of molecules. Recently, numerous papers on machine learning (ML) of molecular properties have also been published. ML models greatly outperform DFT in terms of computational costs, and may even reach comparable accuracy, but they are missing physicality - a direct link to Quantum Physics - which limits their applicability. Here, we propose an approach that combines the strong sides of DFT and ML, namely, physicality and low computational cost. By generalizing the famous Hohenberg-Kohn theorems, we derive general equations for exact electron densities and energies that can naturally guide applications of ML in Quantum Chemistry. Based on these equations, we build a deep neural network that can compute electron densities and energies of a wide range of organic molecules not only much faster, but also closer to exact physical values than curre