An Investigation of Machine Learning Methods Applied to Structure Prediction in Condensed Matter


Added by William Brouwer

Publication date 2014

fields Physics

Language English

Authors William J. Brouwer - James D. Kubicki - Jorge O. Sofo

Materials Science


Abstract

Materials characterization remains a significant, time-consuming undertaking. Generally speaking, spectroscopic techniques are used in conjunction with empirical and ab-initio calculations in order to elucidate structure. These experimental and computational methods typically require significant human input and interpretation, particularly with regard to novel materials. Recently, the application of data mining and machine learning to problems in materials science has shown great promise in reducing this overhead. In the work presented here, several aspects of machine learning are explored with regard to characterizing a model material, titania, using solid-state Nuclear Magnetic Resonance (NMR). Specifically, a large dataset of $^{47}$Ti NMR spectra is generated using ab-initio calculations for generated TiO$_2$ structures. Principal Components Analysis (PCA) reveals that input spectra may be compressed by more than 90% before being used for subsequent machine learning. Two key methods, Support Vector Regression (SVR) and Artificial Neural Networks (ANNs), are used to learn the complex mapping between structural details and input NMR spectra, and both demonstrate excellent accuracy when presented with test sample spectra. This work compares the two as one step towards the construction of an expert system for solid-state materials characterization.
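The PCA compression step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the synthetic single-peak "spectra", the function names `make_synthetic_spectra` and `pca_compress`, and the 99% variance threshold are all assumptions made for illustration, and the amount of compression obtained on this toy data says nothing about the result reported for the real $^{47}$Ti dataset.

```python
import numpy as np

def make_synthetic_spectra(n_samples=200, n_points=500, seed=0):
    """Hypothetical stand-in for an NMR dataset: one Lorentzian peak per spectrum."""
    rng = np.random.default_rng(seed)
    x = np.linspace(-1.0, 1.0, n_points)
    centers = rng.uniform(-0.5, 0.5, size=(n_samples, 1))
    widths = rng.uniform(0.05, 0.2, size=(n_samples, 1))
    return widths**2 / ((x - centers) ** 2 + widths**2)

def pca_compress(spectra, variance_kept=0.99):
    """Project spectra onto the fewest principal components that keep the requested variance."""
    mean = spectra.mean(axis=0)
    centered = spectra - mean
    # Rows of vt are the principal directions; s**2 is proportional to explained variance.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = np.cumsum(s**2) / np.sum(s**2)
    k = int(np.searchsorted(explained, variance_kept)) + 1
    components = vt[:k]
    scores = centered @ components.T  # compressed representation, shape (n_samples, k)
    return scores, components, mean

spectra = make_synthetic_spectra()
scores, components, mean = pca_compress(spectra)
reconstructed = scores @ components + mean
reduction = 1.0 - scores.shape[1] / spectra.shape[1]
print(f"kept {scores.shape[1]} of {spectra.shape[1]} features ({reduction:.0%} reduction)")
```

In a workflow like the one the abstract describes, the `scores` array (rather than the raw spectra) would be the input to a regressor such as SVR or a neural network; that stage is not shown here.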


Learning the electronic density of states in condensed matter

Chiheb Ben Mahmoud, Andrea Anelli, Gabor Csanyi (2020)

The electronic density of states (DOS) quantifies the distribution of the energy levels that can be occupied by electrons in a quasiparticle picture, and is central to modern electronic structure theory. It also underpins the computation and interpretation of experimentally observable material properties such as optical absorption and electrical conductivity. We discuss the challenges inherent in the construction of a machine-learning (ML) framework aimed at predicting the DOS as a combination of local contributions that depend in turn on the geometric configuration of neighbours around each atom, using quasiparticle energy levels from density functional theory as training data. We present a challenging case study that includes configurations of silicon spanning a broad set of thermodynamic conditions, ranging from bulk structures to clusters, and from semiconducting to metallic behavior. We compare different approaches to represent the DOS, and the accuracy of predicting quantities such as the Fermi level, the DOS at the Fermi level, or the band energy, either directly or as a side-product of the evaluation of the DOS. The performance of the model depends crucially on the smoothening of the DOS, and there is a tradeoff to be made between the systematic error associated with the smoothening and the error in the ML model for a specific structure. We demonstrate the usefulness of this approach by computing the density of states of a large amorphous silicon sample, for which it would be prohibitively expensive to compute the DOS by direct electronic structure calculations, and show how the atom-centred decomposition of the DOS that is obtained through our model can be used to extract physical insights into the connections between structural and electronic features.
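The smoothening of the DOS and the derivation of the Fermi level from it can be sketched as follows. This is only an illustration under stated assumptions: the energy levels are made up, Gaussian smearing with a fixed width `sigma` is one common choice that the abstract does not confirm, and the helper names `smeared_dos` and `fermi_level` are hypothetical. The machine-learning model and the atom-centred decomposition are not shown.

```python
import numpy as np

def smeared_dos(levels, grid, sigma):
    """DOS on an energy grid as a sum of unit-area Gaussians, one per energy level."""
    diff = grid[:, None] - levels[None, :]
    gauss = np.exp(-0.5 * (diff / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))
    return gauss.sum(axis=1)

def fermi_level(grid, dos, n_electrons, states_per_level=2.0):
    """Zero-temperature Fermi level: energy where the integrated DOS reaches n_electrons."""
    de = grid[1] - grid[0]  # uniform grid assumed
    cumulative = np.cumsum(dos) * de * states_per_level
    idx = int(np.searchsorted(cumulative, n_electrons))
    return grid[min(idx, len(grid) - 1)]

# Made-up quasiparticle energies in eV (assumption, for illustration only).
levels = np.array([-3.0, -2.5, -1.0, 0.5, 2.0])
grid = np.linspace(-6.0, 5.0, 2201)
dos = smeared_dos(levels, grid, sigma=0.1)

# Five electrons with two states per level: the third level is half filled.
e_f = fermi_level(grid, dos, n_electrons=5.0)
print(f"Fermi level: {e_f:.3f} eV")
```

A larger `sigma` gives a smoother target that is easier to fit but blurs sharp features, which is the tradeoff the abstract refers to.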

Materials Science Machine Learning

Review of Machine-Learning Methods for RNA Secondary Structure Prediction

Qi Zhao, Zheng Zhao, Xiaoya Fan (2020)

Secondary structure plays an important role in determining the function of non-coding RNAs. Hence, identifying RNA secondary structures is of great value to research. Computational prediction is a mainstream approach for predicting RNA secondary structure. Unfortunately, even though new methods have been proposed over the past 40 years, the performance of computational prediction methods has stagnated in the last decade. Recently, with the increasing availability of RNA structure data, new methods based on machine-learning technologies, especially deep learning, have alleviated the issue. In this review, we provide a comprehensive overview of RNA secondary structure prediction methods based on machine-learning technologies and a tabularized summary of the most important methods in this field. The current pending issues in the field of RNA secondary structure prediction and future trends are also discussed.

Biomolecules Machine Learning

Describing condensed matter from atomically resolved imaging data: from structure to generative and causal models

Sergei V. Kalinin, Ayana Ghosh, Rama Vasudevan (2021)

The development of high-resolution imaging methods such as electron and scanning probe microscopy and atomic probe tomography has provided a wealth of information on the structure and functionalities of solids. The availability of this data in turn necessitates the development of approaches to derive quantitative physical information, much like the development of scattering methods in the early 20th century, which gave condensed matter physics one of the most powerful tools in its arsenal. Here, we argue that this transition requires adapting classical macroscopic definitions, which can in turn enable fundamentally new opportunities in understanding physics and chemistry. For example, many macroscopic definitions such as symmetry can be introduced locally only in a Bayesian sense, balancing the prior knowledge of materials physics and experimental data to yield posterior probability distributions. At the same time, a wealth of local data allows fundamentally new approaches for the description of solids based on the construction of statistical and physical generative models, akin to Ginzburg-Landau thermodynamic models. Finally, we note that the availability of observational data opens pathways towards exploring causal mechanisms underpinning solid structure and functionality.

Materials Science

Machine learning approaches for feature engineering of the crystal structure: Application to the prediction of the formation energy of cubic compounds

Prathik R. Kaundinya, Kamal Choudhary, Surya R. Kalidindi (2021)

In this study, we present a novel approach along with the needed computational strategies for efficient and scalable feature engineering of the crystal structure in compounds of different chemical compositions. This approach utilizes a versatile and extensible framework for the quantification of a three-dimensional (3-D) voxelized crystal structure in the form of 2-point spatial correlations of multiple atomic attributes and performs principal component analysis to extract the low-dimensional features that could be used to build surrogate models for material properties of interest. An application of the proposed feature engineering framework is demonstrated on a case study involving the prediction of the formation energies of crystalline compounds using two vastly different surrogate model building strategies - local Gaussian process regression and neural networks. Specifically, it is shown that the top 25 features (i.e., principal component scores) identified by the proposed framework serve as good regressors for the formation energy of the crystalline substance for both model building strategies.
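As a rough illustration of the 2-point spatial correlations mentioned above, the sketch below computes the periodic autocorrelation of a single voxelized field using FFTs. It is a simplified assumption-laden example, not the framework from the paper: the random binary grid, the single attribute, and the function name `two_point_autocorrelation` are hypothetical, and cross-correlations between multiple atomic attributes are not covered.

```python
import numpy as np

def two_point_autocorrelation(field):
    """Periodic 2-point autocorrelation of a voxelized field, computed with FFTs.

    Entry r of the result is the average of field(x) * field(x + r) over all voxels x,
    with periodic wrap-around at the grid boundaries.
    """
    f_hat = np.fft.fftn(field)
    corr = np.fft.ifftn(f_hat * np.conj(f_hat)).real
    return corr / field.size

# Hypothetical voxelized structure: 1.0 where a voxel is occupied, 0.0 elsewhere.
rng = np.random.default_rng(0)
voxels = (rng.random((8, 8, 8)) < 0.2).astype(float)

corr = two_point_autocorrelation(voxels)
# The zero-shift entry equals the occupied volume fraction for a binary field.
print(f"volume fraction: {corr[0, 0, 0]:.4f}")
```

To obtain low-dimensional features in the spirit of the abstract, the correlation maps of many structures could each be flattened into a row vector and passed to principal component analysis; the leading component scores would then serve as regressors.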

Materials Science

Ternary mixed-anion semiconductors with tunable band gaps from machine-learning and crystal structure prediction

Maximilian Amsler, Logan Ward, Vinay I. Hegde (2018)

We report the computational investigation of a series of ternary X$_4$Y$_2$Z and X$_5$Y$_2$Z$_2$ compounds with X={Mg, Ca, Sr, Ba}, Y={P, As, Sb, Bi}, and Z={S, Se, Te}. The compositions for these materials were predicted through a search guided by machine learning, while the structures were resolved using the minima hopping crystal structure prediction method. Based on ab initio calculations, we predict that many of these compounds are thermodynamically stable. In particular, 21 of the X$_4$Y$_2$Z compounds crystallize in a tetragonal structure with $I\bar{4}2d$ symmetry, and exhibit band gaps in the range of 0.3 to 1.8 eV, well suited for various energy applications. We show that several candidate compounds (in particular X$_4$Y$_2$Te and X$_4$Sb$_2$Se) exhibit good photoabsorption in the visible range, while others (e.g., Ba$_4$Sb$_2$Se) show excellent thermoelectric performance due to a high power factor and extremely low lattice thermal conductivities.

Materials Science