
Sensitivity and Dimensionality of Atomic Environment Representations used for Machine Learning Interatomic Potentials

Published by Berk Onat
Publication date: 2020
Research field: Physics
Research language: English





Faithfully representing chemical environments is essential for describing materials and molecules with machine learning approaches. Here, we present a systematic classification of these representations and then investigate: (i) the sensitivity to perturbations and (ii) the effective dimensionality of a variety of atomic environment representations, and over a range of material datasets. Representations investigated include Atom Centred Symmetry Functions, Chebyshev Polynomial Symmetry Functions (CHSF), Smooth Overlap of Atomic Positions, Many-body Tensor Representation and Atomic Cluster Expansion. In area (i), we show that none of the atomic environment representations are linearly stable under tangential perturbations, and that for CHSF there are instabilities for particular choices of perturbation, which we show can be removed with a slight redefinition of the representation. In area (ii), we find that most representations can be compressed significantly without loss of precision, and further that selecting optimal subsets of a representation method improves the accuracy of regression models built for a given dataset.
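The two quantities named in the abstract can be illustrated on a toy example. The sketch below is not the authors' code or any of the representations they study. It assumes a hypothetical, purely radial Gaussian descriptor with a cosine cutoff, a finite-difference estimate of sensitivity, and a PCA variance threshold as one possible definition of effective dimensionality. All function names and parameter values (`eta`, `r_cut`, `tol`, `h`) are illustrative assumptions.

```python
import numpy as np

def cutoff(r, r_cut):
    """Cosine cutoff that decays smoothly to zero at r_cut."""
    return np.where(r < r_cut, 0.5 * (np.cos(np.pi * r / r_cut) + 1.0), 0.0)

def radial_descriptor(neighbors, centers, eta=4.0, r_cut=5.0):
    """Toy radial descriptor for a central atom placed at the origin.

    neighbors: (n, 3) array of neighbor positions relative to the central atom.
    centers:   (m,) array of Gaussian centers along the radial axis.
    Returns an (m,) feature vector.
    """
    r = np.linalg.norm(neighbors, axis=1)
    gauss = np.exp(-eta * (r[:, None] - centers[None, :]) ** 2)
    return (gauss * cutoff(r, r_cut)[:, None]).sum(axis=0)

def sensitivity(neighbors, direction, centers, h=1e-4):
    """Finite-difference change of the descriptor per unit displacement
    of neighbor 0 along the given unit direction."""
    shifted = neighbors.copy()
    shifted[0] += h * direction
    diff = radial_descriptor(shifted, centers) - radial_descriptor(neighbors, centers)
    return np.linalg.norm(diff) / h

def effective_dimension(X, tol=0.99):
    """Number of principal components needed to reach a fraction `tol`
    of the total variance of the feature matrix X (rows are samples)."""
    Xc = X - X.mean(axis=0)
    variances = np.linalg.svd(Xc, compute_uv=False) ** 2
    ratio = np.cumsum(variances) / variances.sum()
    return int(np.searchsorted(ratio, tol) + 1)

# Illustrative usage on synthetic data (no real material dataset is used).
rng = np.random.default_rng(0)
centers = np.linspace(0.5, 4.5, 8)
neighbors = np.array([[2.0, 0.0, 0.0], [0.0, 2.5, 0.0], [0.0, 0.0, 3.0]])

radial = sensitivity(neighbors, np.array([1.0, 0.0, 0.0]), centers)
tangential = sensitivity(neighbors, np.array([0.0, 1.0, 0.0]), centers)
print("radial sensitivity:", radial)
print("tangential sensitivity:", tangential)

# Feature matrix built from randomly placed neighbor shells.
X = np.array([
    radial_descriptor(rng.uniform(-4.0, 4.0, size=(6, 3)), centers)
    for _ in range(200)
])
print("effective dimension:", effective_dimension(X))
```

In this toy setting, a tangential displacement changes the neighbor distance only at second order, so a purely radial descriptor barely responds to it at first order, whereas a radial displacement produces a clear response. The representations studied in the paper also carry angular information, so this example only shows what a direction-dependent sensitivity probe looks like.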


Read also

Interatomic potentials (IPs) are reduced-order models for calculating the potential energy of a system of atoms given their positions in space and species. IPs treat atoms as classical particles without explicitly modeling electrons and thus are computationally far less expensive than first-principles methods, enabling molecular simulations of significantly larger systems over longer times. Developing an IP is a complex iterative process involving multiple steps: assembling a training set, designing a functional form, optimizing the function parameters, testing model quality, and deployment to molecular simulation packages. This paper introduces the KIM-based learning-integrated fitting framework (KLIFF), a package that facilitates the entire IP development process. KLIFF supports both analytic and machine learning IPs. It adopts a modular approach whereby various components in the fitting process, such as atomic environment descriptors, functional forms, loss functions, optimizers, quality analyzers, and so on, work seamlessly with each other. This provides a flexible framework for the rapid design of new IP forms. Trained IPs are compatible with the Knowledgebase of Interatomic Models (KIM) application programming interface (API) and can be readily used in major materials simulation packages compatible with KIM, including ASE, DL_POLY, GULP, LAMMPS, and QC. KLIFF is written in Python with computationally intensive components implemented in C++. It is parallelized over data and supports both shared-memory multicore desktop machines and high-performance distributed memory computing clusters. We demonstrate the use of KLIFF by fitting an analytic Stillinger-Weber potential and a machine learning neural network potential for silicon. The KLIFF package, together with its documentation, is publicly available at: https://github.com/openkim/kliff.
We introduce a Gaussian approximation potential (GAP) for atomistic simulations of liquid and amorphous elemental carbon. Based on a machine-learning representation of the density-functional theory (DFT) potential-energy surface, such interatomic potentials enable materials simulations with close-to-DFT accuracy but at much lower computational cost. We first determine the maximum accuracy that any finite-range potential can achieve in carbon structures; then, using a novel hierarchical set of two-, three-, and many-body structural descriptors, we construct a GAP model that can indeed reach the target accuracy. The potential yields accurate energetic and structural properties over a wide range of densities; it also correctly captures the structure of the liquid phases, at variance with state-of-the-art empirical potentials. Exemplary applications of the GAP model to surfaces of diamond-like tetrahedral amorphous carbon (ta-C) are presented, including an estimate of the amorphous material's surface energy, and simulations of high-temperature surface reconstructions (graphitization). The new interatomic potential appears to be promising for realistic and accurate simulations of nanoscale amorphous carbon structures.
We propose a novel active learning scheme for automatically sampling a minimum number of uncorrelated configurations for fitting the Gaussian Approximation Potential (GAP). Our active learning scheme consists of an unsupervised machine learning (ML) scheme coupled to a Bayesian optimization technique that evaluates the GAP model. We apply this scheme to a hafnium dioxide (HfO2) dataset generated from a melt-quench ab initio molecular dynamics (AIMD) protocol. Our results show that the active learning scheme, with no prior knowledge of the dataset, is able to extract a configuration that reaches the required energy fit tolerance. Further, molecular dynamics (MD) simulations performed using this actively learned GAP model on 6144-atom systems in the amorphous and liquid states elucidate the structural properties of HfO2 with near ab initio precision and at quench rates (i.e. 1.0 K/ps) not accessible via AIMD. The melt and amorphous x-ray structure factors generated from our simulation are in good agreement with experiment. Additionally, the calculated diffusion constants are in good agreement with previous ab initio studies.
The universal mathematical form of machine-learning potentials (MLPs) shifts the core of development of interatomic potentials to collecting proper training data. Ideally, the training set should encompass diverse local atomic environments, but the conventional approach is prone to sampling similar configurations repeatedly, mainly due to the Boltzmann statistics. As such, practitioners handpick a large pool of distinct configurations manually, stretching the development period significantly. Herein, we suggest a novel sampling method optimized for gathering diverse yet relevant configurations semi-automatically. This is achieved by applying metadynamics with the descriptor for the local atomic environment as a collective variable. As a result, the simulation is automatically steered toward unvisited local environment space such that each atom experiences diverse chemical environments without redundancy. We apply the proposed metadynamics sampling to H:Pt(111), GeTe, and Si systems. Throughout the examples, a small number of metadynamics trajectories can provide reference structures necessary for training high-fidelity MLPs. By proposing a semi-automatic sampling method tuned for MLPs, the present work paves the way to wider use of MLPs in many challenging applications.
In this work, we discuss the use of machine learning techniques for rapid prediction of detonation properties including explosive energy, detonation velocity, and detonation pressure. Further, analysis is applied to individual molecules in order to explore the contribution of bonding motifs to these properties. Feature descriptors evaluated include Morgan fingerprints, E-state vectors, a custom sum over bonds descriptor, and Coulomb matrices. Algorithms discussed include kernel ridge regression, least absolute shrinkage and selection operator (LASSO) regression, Gaussian process regression, and the multi-layer perceptron (a neural network). Effects of regularization, kernel selection, network parameters, and dimensionality reduction are discussed. We determine that even when using a small training set, non-linear regression methods may create models within a useful error tolerance for screening of materials.