No Arabic abstract
We propose a novel active learning scheme for automatically sampling a minimum number of uncorrelated configurations for fitting the Gaussian Approximation Potential (GAP). Our active learning scheme consists of an unsupervised machine learning (ML) scheme coupled to Bayesian optimization technique that evaluates the GAP model. We apply this scheme to a Hafnium dioxide (HfO2) dataset generated from a melt-quench ab initio molecular dynamics (AIMD) protocol. Our results show that the active learning scheme, with no prior knowledge of the dataset is able to extract a configuration that reaches the required energy fit tolerance. Further, molecular dynamics (MD) simulations performed using this active learned GAP model on 6144-atom systems of amorphous and liquid state elucidate the structural properties of HfO2 with near ab initio precision and quench rates (i.e. 1.0 K/ps) not accessible via AIMD. The melt and amorphous x-ray structural factors generated from our simulation are in good agreement with experiment. Additionally, the calculated diffusion constants are in good agreement with previous ab initio studies.
Drive towards improved performance of machine learning models has led to the creation of complex features representing a database of condensed matter systems. The complex features, however, do not offer an intuitive explanation on which physical attributes do improve the performance. The effect of the database on the performance of the trained model is often neglected. In this work we seek to understand in depth the effect that the choice of features and the properties of the database have on a machine learning application. In our experiments, we consider the complex phase space of carbon as a test case, for which we use a set of simple, human understandable and cheaply computable features for the aim of predicting the total energy of the crystal structure. Our study shows that (i) the performance of the machine learning model varies depending on the set of features and the database, (ii) is not transferable to every structure in the phase space and (iii) depends on how well structures are represented in the database.
Understanding the structure and properties of refractory oxides are critical for high temperature applications. In this work, a combined experimental and simulation approach uses an automated closed loop via an active-learner, which is initialized by X-ray and neutron diffraction measurements, and sequentially improves a machine-learning model until the experimentally predetermined phase space is covered. A multi-phase potential is generated for a canonical example of the archetypal refractory oxide, HfO2, by drawing a minimum number of training configurations from room temperature to the liquid state at ~2900oC. The method significantly reduces model development time and human effort.
The need for advanced materials has led to the development of complex, multi-component alloys or solid-solution alloys. These materials have shown exceptional properties like strength, toughness, ductility, electrical and electronic properties. Current development of such material systems are hindered by expensive experiments and computationally demanding first-principles simulations. Atomistic simulations can provide reasonable insights on properties in such material systems. However, the issue of designing robust potentials still exists. In this paper, we explore a deep convolutional neural-network based approach to develop the atomistic potential for such complex alloys to investigate materials for insights into controlling properties. In the present work, we propose a voxel representation of the atomic configuration of a cell and design a 3D convolutional neural network to learn the interaction of the atoms. Our results highlight the performance of the 3D convolutional neural network and its efficacy in machine-learning the atomistic potential. We also explore the role of voxel resolution and provide insights into the two bounding box methodologies implemented for voxelization.
Faithfully representing chemical environments is essential for describing materials and molecules with machine learning approaches. Here, we present a systematic classification of these representations and then investigate: (i) the sensitivity to perturbations and (ii) the effective dimensionality of a variety of atomic environment representations, and over a range of material datasets. Representations investigated include Atom Centred Symmetry Functions, Chebyshev Polynomial Symmetry Functions (CHSF), Smooth Overlap of Atomic Positions, Many-body Tensor Representation and Atomic Cluster Expansion. In area (i), we show that none of the atomic environment representations are linearly stable under tangential perturbations, and that for CHSF there are instabilities for particular choices of perturbation, which we show can be removed with a slight redefinition of the representation. In area (ii), we find that most representations can be compressed significantly without loss of precision, and further that selecting optimal subsets of a representation method improves the accuracy of regression models built for a given dataset.
We introduce a Gaussian approximation potential (GAP) for atomistic simulations of liquid and amorphous elemental carbon. Based on a machine-learning representation of the density-functional theory (DFT) potential-energy surface, such interatomic potentials enable materials simulations with close-to DFT accuracy but at much lower computational cost. We first determine the maximum accuracy that any finite-range potential can achieve in carbon structures; then, using a novel hierarchical set of two-, three-, and many-body structural descriptors, we construct a GAP model that can indeed reach the target accuracy. The potential yields accurate energetic and structural properties over a wide range of densities; it also correctly captures the structure of the liquid phases, at variance with state-of-the-art empirical potentials. Exemplary applications of the GAP model to surfaces of diamond-like tetrahedral amorphous carbon (ta-C) are presented, including an estimate of the amorphous materials surface energy, and simulations of high-temperature surface reconstructions (graphitization). The new interatomic potential appears to be promising for realistic and accurate simulations of nanoscale amorphous carbon structures.