In this work, we discuss the use of machine learning techniques for rapid prediction of detonation properties, including explosive energy, detonation velocity, and detonation pressure. Further, analysis is applied to individual molecules in order to explore the contribution of bonding motifs to these properties. Feature descriptors evaluated include Morgan fingerprints, E-state vectors, a custom sum-over-bonds descriptor, and Coulomb matrices. Algorithms discussed include kernel ridge regression, least absolute shrinkage and selection operator (LASSO) regression, Gaussian process regression, and the multi-layer perceptron (a neural network). Effects of regularization, kernel selection, network parameters, and dimensionality reduction are discussed. We determine that even when using a small training set, non-linear regression methods may create models within a useful error tolerance for screening of materials.
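As an illustration of the kernel ridge regression mentioned above, the following is a minimal sketch in plain Python, assuming a Gaussian kernel. The function names, the kernel width `gamma`, the regularization strength `lam`, and the toy data are all assumptions made for illustration; they are not the descriptors, hyperparameters, or data used in the work.

```python
import math

def gaussian_kernel(x, y, gamma):
    """Gaussian (RBF) kernel between two feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def krr_fit(X, y, gamma=2.0, lam=1e-6):
    """Return dual coefficients alpha solving (K + lam * I) alpha = y."""
    n = len(X)
    K = [[gaussian_kernel(X[i], X[j], gamma) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return solve_linear(K, y)

def krr_predict(X_train, alpha, x_new, gamma=2.0):
    """Predict as a kernel-weighted sum over the training points."""
    return sum(a * gaussian_kernel(xi, x_new, gamma)
               for a, xi in zip(alpha, X_train))

# Toy one-dimensional data (hypothetical; stands in for real descriptors).
X_train = [[0.0], [0.5], [1.0], [1.5], [2.0]]
y_train = [math.sin(x[0]) for x in X_train]

alpha = krr_fit(X_train, y_train)
print(krr_predict(X_train, alpha, [0.75]))
```

Increasing `lam` trades training accuracy for smoothness, which is the regularization effect the abstract refers to; swapping `gaussian_kernel` for another kernel function is how kernel selection would enter this sketch.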
We present a proof of concept that machine learning techniques can be used to predict the properties of CNOHF energetic molecules from their molecular structures. We focus on a small but diverse dataset consisting of 109 molecular structures spread across ten compound classes. Up until now, candidate molecules for energetic materials have been screened using predictions from expensive quantum simulations and thermochemical codes. We present a comprehensive comparison of machine learning models and several molecular featurization methods: sum over bonds, custom descriptors, Coulomb matrices, bag of bonds, and fingerprints. The best featurization was sum over bonds (bond counting), and the best model was kernel ridge regression. Despite having a small dataset, we obtain acceptable errors and Pearson correlations for the out-of-sample prediction of detonation pressure, detonation velocity, explosive energy, heat of formation, density, and other properties. By including another dataset with 309 additional molecules in our training, we show how the error can be pushed lower, although the convergence with number of molecules is slow. Our work paves the way for future applications of machine learning in this domain, including automated lead generation and interpreting machine learning models to obtain novel chemical insights.
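The sum over bonds (bond counting) featurization can be sketched as follows. This is an illustrative reading of "bond counting", not the authors' implementation: the input format (a list of `(atom, atom, order)` tuples), the label scheme, and the helper names are all hypothetical, and the two example molecules are written as simple Lewis structures with one N=O and one N-O bond per nitro group.

```python
from collections import Counter

# Hypothetical symbols for bond orders 1, 2, and 3.
ORDER_SYMBOL = {1: "-", 2: "=", 3: "#"}

def bond_label(atom_a, atom_b, order):
    """Order-independent label such as 'C-H' or 'N=O'."""
    first, second = sorted((atom_a, atom_b))
    return f"{first}{ORDER_SYMBOL[order]}{second}"

def count_bonds(bonds):
    """Count how often each bond type occurs in one molecule."""
    return Counter(bond_label(a, b, order) for a, b, order in bonds)

def sum_over_bonds(molecules):
    """Return the shared bond vocabulary and one count vector per molecule."""
    counts = [count_bonds(bonds) for bonds in molecules]
    vocabulary = sorted(set().union(*counts))
    vectors = [[c[label] for label in vocabulary] for c in counts]
    return vocabulary, vectors

# Toy inputs (assumed Lewis structures, for illustration only).
nitromethane = [("C", "H", 1)] * 3 + [("C", "N", 1), ("N", "O", 2), ("O", "N", 1)]
nitramide = [("N", "H", 1)] * 2 + [("N", "N", 1), ("N", "O", 2), ("N", "O", 1)]

vocabulary, vectors = sum_over_bonds([nitromethane, nitramide])
print(vocabulary)
print(vectors)
```

Each molecule becomes a fixed-length vector of bond-type counts over the shared vocabulary, which can then be passed to any regression model.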
Accurate and efficient calculations of absorption spectra of molecules and materials are essential for the understanding and rational design of broad classes of systems. Solving the Bethe-Salpeter equation (BSE) for electron-hole pairs usually yields accurate predictions of absorption spectra, but it is computationally expensive, especially if thermal averages of spectra computed for multiple configurations are required. We present a method based on machine learning to evaluate a key quantity entering the definition of absorption spectra: the dielectric screening. We show that our approach yields a model for the screening that is transferable between multiple configurations sampled during first principles molecular dynamics simulations; hence it leads to a substantial improvement in the efficiency of calculations of finite temperature spectra. We obtained computational gains of one to two orders of magnitude for systems with 50 to 500 atoms, including liquids, solids, nanostructures, and solid/liquid interfaces. Importantly, the models of dielectric screening derived here may be used not only in the solution of the BSE but also in developing functionals for time-dependent density functional theory (TDDFT) calculations of homogeneous and heterogeneous systems. Overall, our work provides a strategy to combine machine learning with electronic structure calculations to accelerate first principles simulations of excited-state properties.
Synthesis of advanced inorganic materials with a minimum number of trials is of paramount importance for accelerating inorganic materials development. The enormous complexity of existing multi-variable synthesis methods leads to high uncertainty, numerous trials, and exorbitant cost. Recently, machine learning (ML) has demonstrated tremendous potential for materials research. Here, we report the application of ML to optimize and accelerate the material synthesis process in two representative multi-variable systems. A classification ML model for chemical vapor deposition-grown MoS2 is established, capable of optimizing the synthesis conditions to achieve a higher success rate. A regression model is constructed for hydrothermally synthesized carbon quantum dots to enhance process-related properties such as the photoluminescence quantum yield. A progressive adaptive model is further developed, aiming to involve ML at the beginning stage of new material synthesis. Optimization of the experimental outcome with a minimized number of trials can be achieved through effective feedback loops. This work serves as a proof of concept revealing the feasibility and remarkable capability of ML to facilitate the synthesis of inorganic materials, and opens up a new window for accelerating material development.
Faithfully representing chemical environments is essential for describing materials and molecules with machine learning approaches. Here, we present a systematic classification of these representations and then investigate (i) the sensitivity to perturbations and (ii) the effective dimensionality of a variety of atomic environment representations across a range of materials datasets. Representations investigated include Atom Centred Symmetry Functions, Chebyshev Polynomial Symmetry Functions (CHSF), Smooth Overlap of Atomic Positions, Many-body Tensor Representation and Atomic Cluster Expansion. In area (i), we show that none of the atomic environment representations are linearly stable under tangential perturbations, and that for CHSF there are instabilities for particular choices of perturbation, which we show can be removed with a slight redefinition of the representation. In area (ii), we find that most representations can be compressed significantly without loss of precision, and further that selecting optimal subsets of a representation method improves the accuracy of regression models built for a given dataset.
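One way to probe the sensitivity of a representation to perturbations numerically is sketched below. It assumes a single radial symmetry function of the Behler-Parrinello form with a cosine cutoff, evaluated for a central atom at the origin. The parameter values (`eta`, `r_s`, `r_c`), the toy neighbour positions, and the helper names are assumptions, and this radial-only toy is not one of the representations as implemented in the work.

```python
import math

def cutoff(r, r_c):
    """Cosine cutoff that decays smoothly to zero at r_c."""
    if r >= r_c:
        return 0.0
    return 0.5 * (math.cos(math.pi * r / r_c) + 1.0)

def radial_descriptor(neighbors, eta=1.0, r_s=1.0, r_c=4.0):
    """Sum of Gaussians of neighbour distances from a central atom at the origin."""
    total = 0.0
    for pos in neighbors:
        r = math.sqrt(sum(c * c for c in pos))
        total += math.exp(-eta * (r - r_s) ** 2) * cutoff(r, r_c)
    return total

def response(neighbors, index, direction, eps):
    """Absolute change in the descriptor when one neighbour moves by eps * direction."""
    moved = [list(p) for p in neighbors]
    moved[index] = [c + eps * d for c, d in zip(moved[index], direction)]
    return abs(radial_descriptor(moved) - radial_descriptor(neighbors))

# Toy environment: two neighbours of a central atom at the origin.
neighbors = [(1.5, 0.0, 0.0), (0.0, 2.0, 0.0)]
radial = (1.0, 0.0, 0.0)      # along the bond to neighbour 0
tangential = (0.0, 1.0, 0.0)  # perpendicular to that bond

for eps in (1e-2, 1e-3):
    print(eps,
          response(neighbors, 0, radial, eps),
          response(neighbors, 0, tangential, eps))
```

In this toy case the response to the radial displacement shrinks roughly in proportion to `eps`, while the response to the tangential displacement shrinks roughly as `eps` squared, because a tangential move changes the neighbour distance only at second order.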
Thermoelectric conversion using the Seebeck effect for the generation of electricity is becoming an indispensable technology for energy harvesting and smart thermal management. Recently, spin-driven thermoelectric effects (STEs), which employ emerging phenomena such as the spin-Seebeck effect (SSE) and the anomalous Nernst effect (ANE), have garnered much attention as a promising path towards low-cost and versatile thermoelectric technology with easily scalable manufacturing. However, progress in the development of STE devices is hindered by the lack of understanding of the mechanisms and materials parameters that govern the STEs. To address this problem, we enlist machine learning modeling to establish the key physical parameters controlling the SSE. Guided by these models, we have carried out a high-throughput experiment which led to the identification of a novel STE material with a thermopower an order of magnitude larger than that of current-generation STE devices.