
Reliable and Explainable Machine Learning Methods for Accelerated Material Discovery

Posted by: Bhavya Kailkhura
Publication date: 2019
Research field: Physics
Paper language: English





Material scientists are increasingly adopting machine learning (ML) for potentially important decisions, such as the discovery, development, optimization, synthesis, and characterization of materials. However, despite ML's impressive performance in commercial applications, several unique challenges arise when applying ML to materials science. In this context, the contributions of this work are twofold. First, we identify common pitfalls of existing ML techniques when learning from underrepresented/imbalanced material data. Specifically, we show that with imbalanced data, standard methods for assessing the quality of ML models break down and lead to misleading conclusions. Furthermore, we find that the model's own confidence score cannot be trusted and that model introspection methods (using simpler models) do not help, as they result in a loss of predictive performance (a reliability-explainability trade-off). Second, to overcome these challenges, we propose a general-purpose explainable and reliable machine-learning framework. Specifically, we propose a novel pipeline that employs an ensemble of simpler models to reliably predict material properties. We also propose a transfer learning technique and show that the performance loss due to a model's simplicity can be overcome by exploiting correlations among different material properties. A new evaluation metric and a trust score to better quantify confidence in the predictions are also proposed. To improve interpretability, we add a rationale generator component to our framework, which provides both model-level and decision-level explanations. Finally, we demonstrate the versatility of our technique on two applications: 1) predicting properties of crystalline compounds, and 2) identifying novel, potentially stable solar cell materials.
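As a rough illustration of the ensemble-of-simpler-models idea with an agreement-based trust score, here is a minimal NumPy sketch; the synthetic data, the choice of bootstrapped linear models, and the function names are illustrative assumptions, not the paper's actual pipeline.

```python
# Sketch: an ensemble of simple models whose disagreement yields a
# trust score. Bootstrap + linear least squares stand in for the
# paper's actual method; all names here are illustrative.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                    # synthetic descriptors
y = X @ np.array([2.0, -1.0, 0.5, 0.0, 0.0]) + 0.1 * rng.normal(size=200)

# Bootstrap ensemble of simple (interpretable) linear models
weights = []
for _ in range(20):
    idx = rng.integers(0, len(X), len(X))        # bootstrap resample
    w, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
    weights.append(w)

def predict_with_trust(x):
    """Ensemble mean plus a trust score from member agreement."""
    preds = np.array([w @ x for w in weights])
    trust = 1.0 / (1.0 + preds.std())            # agreement -> trust near 1
    return preds.mean(), trust

pred, trust = predict_with_trust(X[0])
print(pred, trust)
```

The trust score here is simply a monotone function of ensemble spread; inputs far from the training distribution produce disagreeing members and hence a low score.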


Read also

The discovery of new multicomponent inorganic compounds can provide direct solutions to many scientific and engineering challenges, yet the vast size of the uncharted material space dwarfs current synthesis throughput. While the computational crystal structure prediction is expected to mitigate this frustration, the NP-hardness and steep costs of density functional theory (DFT) calculations prohibit material exploration at scale. Herein, we introduce SPINNER, a highly efficient and reliable structure-prediction framework based on exhaustive random searches and evolutionary algorithms, which is completely free from empiricism. Empowered by accurate neural network potentials, the program can navigate the configuration space faster than DFT by more than 10$^{2}$-fold. In blind tests on 60 ternary compositions diversely selected from the experimental database, SPINNER successfully identifies experimental (or theoretically more stable) phases for ~80% of materials within 5000 generations, entailing up to half a million structure evaluations for each composition. When benchmarked against previous data mining or DFT-based evolutionary predictions, SPINNER identifies more stable phases in the majority of cases. By developing a reliable and fast structure-prediction framework, this work opens the door to large-scale, unbounded computational exploration of undiscovered inorganic crystals.
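The evolutionary core of such a search can be caricatured in a few lines: random starting "structures", mutation of the fittest, and elitist selection against a cheap surrogate energy. The toy quadratic below stands in for SPINNER's neural-network potential, and every parameter is an illustrative assumption.

```python
# Sketch of an evolutionary structure search: random initial candidates,
# mutate the best, keep the lowest "energy". A toy quadratic replaces
# the (much cheaper-than-DFT) neural-network potential.
import numpy as np

rng = np.random.default_rng(42)

def energy(x):
    # toy surrogate potential with a known minimum at x = (1, 1, 1)
    return float(np.sum((x - 1.0) ** 2))

pop = [rng.normal(size=3) for _ in range(20)]    # random initial population
for gen in range(200):
    pop.sort(key=energy)
    parents = pop[:5]                            # elitist selection
    children = [p + 0.1 * rng.normal(size=3)     # Gaussian mutation
                for p in parents for _ in range(3)]
    pop = parents + children

best = min(pop, key=energy)
print(energy(best))
```

Because the parents survive each generation, the best energy is non-increasing; the real method additionally uses crossover and structural constraints absent from this sketch.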
Machine learning models, trained on data from ab initio quantum simulations, are yielding molecular dynamics potentials with unprecedented accuracy. One limiting factor is the quantity of available training data, which can be expensive to obtain. A quantum simulation often provides all atomic forces, in addition to the total energy of the system. These forces provide much more information than the energy alone. It may appear that training a model to this large quantity of force data would introduce significant computational costs. Actually, training to all available force data should only be a few times more expensive than training to energies alone. Here, we present a new algorithm for efficient force training, and benchmark its accuracy by training to forces from real-world datasets for organic chemistry and bulk aluminum.
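Why forces help can be seen in a minimal fitting example: forces are negative gradients of the energy model, so each configuration contributes extra rows to the least-squares system. The 1-D polynomial potential below is a made-up illustration, not the paper's algorithm.

```python
# Sketch: fit an energy model E(x) = w . phi(x) to energies AND forces.
# Forces F = -dE/dx add derivative rows to the linear system, so three
# samples pin down three coefficients exactly. True potential: E = x^2.
import numpy as np

xs = np.array([-1.0, 0.5, 2.0])
E = xs ** 2                 # observed energies
F = -2 * xs                 # observed forces F = -dE/dx

phi = np.stack([xs, xs**2, xs**3], axis=1)               # energy rows
dphi = np.stack([np.ones_like(xs), 2 * xs, 3 * xs**2], axis=1)

A = np.vstack([phi, -dphi])       # force rows use -d(phi)/dx
b = np.concatenate([E, F])
w, *_ = np.linalg.lstsq(A, b, rcond=None)
print(w)  # recovers coefficients of x, x^2, x^3: ~[0, 1, 0]
```

Doubling the row count roughly doubles the cost of the solve, while tripling the information per configuration; this is the intuition behind force training being "only a few times more expensive".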
Exciting advances have been made in artificial intelligence (AI) during the past decades. Among them, applications of machine learning (ML) and deep learning techniques brought human-competitive performance in various tasks across fields, including image recognition, speech recognition, and natural language understanding. Even in Go, the ancient game of profound complexity, AI players have already beaten human world champions convincingly, both with and without learning from humans. In this work, we show that our unsupervised machines (Atom2Vec) can learn the basic properties of atoms by themselves from an extensive database of known compounds and materials. These learned properties are represented as high-dimensional vectors, and clustering of atoms in vector space classifies them into meaningful groups consistent with human knowledge. We use the atom vectors as basic input units for neural networks and other ML models designed and trained to predict material properties, which demonstrate significant accuracy.
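The spirit of Atom2Vec can be sketched with a word2vec-style trick: describe each atom by the compound environments it appears in, factorize the co-occurrence matrix, and check that chemically similar atoms land near each other. The tiny formula list and helper names below are illustrative assumptions, not the actual dataset or model.

```python
# Sketch of the Atom2Vec idea: atoms that occur in similar compound
# environments get similar vectors. A handful of binary formulas
# stands in for the real database of known materials.
import numpy as np

formulas = [
    ("Na", "Cl"), ("K", "Cl"), ("Na", "Br"), ("K", "Br"),
    ("Mg", "O"), ("Ca", "O"), ("Mg", "S"), ("Ca", "S"),
]
atoms = sorted({a for f in formulas for a in f})
idx = {a: i for i, a in enumerate(atoms)}

# Atom-environment co-occurrence matrix: which partners each atom sees
M = np.zeros((len(atoms), len(atoms)))
for a, b in formulas:
    M[idx[a], idx[b]] += 1
    M[idx[b], idx[a]] += 1

# SVD turns co-occurrence rows into dense atom vectors (the real method
# keeps only leading components as a low-dimensional embedding)
U, S, _ = np.linalg.svd(M)
vectors = U * S

def dist(a, b):
    return float(np.linalg.norm(vectors[idx[a]] - vectors[idx[b]]))

# Alkali metals Na and K share environments, so they sit closer to
# each other than to the alkaline-earth Mg
print(dist("Na", "K"), dist("Na", "Mg"))
```

Clustering such vectors reproduces group structure (alkali metals, halogens, ...) without any hand-coded chemistry, which is the paper's central observation.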
Solar energy plays an important role in solving serious environmental problems and meeting high energy demand. However, the lack of suitable materials hinders further progress of this technology. Here, we present the largest inorganic solar-cell material search to date using density functional theory (DFT) and machine-learning approaches. We calculated the spectroscopic limited maximum efficiency (SLME) using the Tran-Blaha modified Becke-Johnson potential for 5097 non-metallic materials and identified 1997 candidates with an SLME higher than 10%, including 934 candidates with suitable convex-hull stability and effective carrier mass. Screening for 2D-layered cases, we found 58 potential materials and performed G0W0 calculations on a subset to estimate the prediction uncertainty. As the above DFT methods are still computationally expensive, we developed a high-accuracy machine learning model to pre-screen efficient materials and applied it to over a million materials. Our results provide a general framework and universal strategy for the design of high-efficiency solar cell materials. The data and tools are publicly distributed at: https://www.ctcms.nist.gov/~knc6/JVASP.html, https://www.ctcms.nist.gov/jarvisml/, https://jarvis.nist.gov/ and https://github.com/usnistgov/jarvis .
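The screening funnel described above reduces to successive threshold filters. The sketch below shows the shape of such a filter; the records and cutoffs are illustrative placeholders, not actual JARVIS data (only the 10% SLME threshold comes from the abstract).

```python
# Sketch of a screening funnel: keep candidates that pass efficiency,
# stability, and carrier-mass cuts. Data and the ehull/m_eff cutoffs
# are made up for illustration.
candidates = [
    {"name": "A", "slme": 0.25, "ehull": 0.00, "m_eff": 0.4},
    {"name": "B", "slme": 0.08, "ehull": 0.00, "m_eff": 0.3},
    {"name": "C", "slme": 0.18, "ehull": 0.30, "m_eff": 0.5},
    {"name": "D", "slme": 0.15, "ehull": 0.02, "m_eff": 2.5},
]
screened = [c["name"] for c in candidates
            if c["slme"] > 0.10       # efficiency above 10 %
            and c["ehull"] < 0.10     # near the convex hull (eV/atom)
            and c["m_eff"] < 1.0]     # light charge carriers
print(screened)
```

In the paper's workflow, a fast ML model applies cheap filters like these to a million materials before expensive DFT and G0W0 calculations confirm the survivors.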
We present a novel deep learning (DL) approach to produce highly accurate predictions of macroscopic physical properties of solid-solution binary alloys and magnetic systems. The major idea is to make use of the correlations between different physical properties in alloy systems to improve the prediction accuracy of neural network (NN) models. We use multitasking NN models to simultaneously predict the total energy, charge density, and magnetic moment. These physical properties mutually serve as constraints during the training of the multitasking NN, resulting in more reliable DL models because multiple physical properties are correctly learned by a single model. Two binary alloys, copper-gold (CuAu) and iron-platinum (FePt), were studied. Our results show that once the multitasking NNs are trained, they can estimate the material properties for a specific configuration hundreds of times faster than first-principles density functional theory calculations while retaining comparable accuracy. We used a simple measure based on the root-mean-squared error (RMSE) to quantify the quality of the NN models, and found that the inclusion of charge density and magnetic moment as physical constraints leads to more stable models that exhibit improved accuracy and reduced uncertainty for the energy predictions.
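The multitasking structure amounts to a shared trunk feeding one head per property, trained on a summed loss so each property constrains the shared representation. The sketch below uses synthetic data and a plain-NumPy linear network as a stand-in for the paper's actual NN architecture; all sizes and names are assumptions.

```python
# Sketch of multitask training: one shared trunk (W1), two task heads
# (columns of W2), jointly trained so both properties shape the trunk.
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(256, 4))             # descriptors of a configuration
W_true = rng.normal(size=(4, 2))
Y = X @ W_true                            # columns: e.g. [energy, moment]

W1 = 0.5 * rng.normal(size=(4, 3))        # shared trunk
W2 = 0.5 * rng.normal(size=(3, 2))        # two task heads
lr, n = 0.05, len(X)
for _ in range(5000):
    err = X @ W1 @ W2 - Y                 # residuals for both tasks at once
    gW2 = (X @ W1).T @ err / n            # gradient of summed MSE wrt heads
    gW1 = X.T @ (err @ W2.T) / n          # gradient wrt the shared trunk
    W1 -= lr * gW1
    W2 -= lr * gW2

rmse = np.sqrt(((X @ W1 @ W2 - Y) ** 2).mean(axis=0))
print(rmse)                               # per-task root-mean-squared error
```

Because the trunk gradient sums contributions from every head, a property that is easy to learn regularizes the representation used by the harder ones; this is the coupling the abstract describes as properties "mutually serving as constraints".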