Automated Generation and Ensemble-Learned Matching of X-ray Absorption Spectra


Abstract in English

We report the development of XASdb, a large database of computed reference X-ray absorption spectra (XAS), and a novel Ensemble-Learned Spectra IdEntification (ELSIE) algorithm for matching spectra. XASdb currently hosts more than 300,000 K-edge X-ray absorption near-edge spectra (XANES) for over 30,000 materials from the open-science Materials Project database. We describe a high-throughput automation framework for FEFF calculations, built on robust, rigorously benchmarked parameters. We demonstrate that the ELSIE algorithm, which combines 33 weak learners, each comprising a set of preprocessing steps and a similarity metric, achieves up to 84.2% accuracy in identifying the correct oxidation state and coordination environment on a test set of 19 K-edge XANES spectra spanning a diverse range of chemistries and crystal structures. XASdb and the ELSIE algorithm have been integrated into a web application in the Materials Project, providing an important new public resource for XAS analysis to all materials researchers. Finally, the ELSIE algorithm itself has been made available as part of Veidt, an open-source machine learning library for materials science.
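
To make the ensemble idea concrete, the following is a minimal sketch of how weak learners built from a preprocessing step and a similarity metric could vote on the best-matching reference spectrum. It is not the ELSIE implementation and does not use the Veidt API: the function names (`min_max`, `first_derivative`, `cosine`, `neg_euclidean`, `rank_references`), the choice of four weak learners instead of 33, and the simple majority vote are all assumptions made for illustration. All spectra are assumed to be sampled on the same energy grid.

```python
import numpy as np

# --- Hypothetical preprocessing steps (illustrative, not ELSIE's actual set) ---
def min_max(y):
    """Scale intensities to the range [0, 1]."""
    y = np.asarray(y, dtype=float)
    span = y.max() - y.min()
    return (y - y.min()) / span if span > 0 else np.zeros_like(y)

def first_derivative(y):
    """Finite-difference derivative with respect to the grid index."""
    return np.gradient(np.asarray(y, dtype=float))

# --- Hypothetical similarity metrics (larger value means more similar) ---
def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def neg_euclidean(a, b):
    return -float(np.linalg.norm(a - b))

# Each weak learner is one (preprocessing, similarity) pair; 2 x 2 = 4 here.
WEAK_LEARNERS = [
    (prep, sim)
    for prep in (min_max, first_derivative)
    for sim in (cosine, neg_euclidean)
]

def rank_references(target, references):
    """Rank reference labels by the number of weak learners that pick them.

    `references` maps a label to a spectrum on the same grid as `target`.
    """
    labels = list(references)
    votes = dict.fromkeys(labels, 0)
    for prep, sim in WEAK_LEARNERS:
        t = prep(target)
        scores = [sim(t, prep(references[label])) for label in labels]
        votes[labels[int(np.argmax(scores))]] += 1  # one vote per learner
    return sorted(labels, key=lambda label: votes[label], reverse=True)

if __name__ == "__main__":
    # Synthetic edge-like spectra, purely for demonstration.
    grid = np.linspace(0.0, 1.0, 200)
    ref_a = 1.0 / (1.0 + np.exp(-(grid - 0.3) / 0.02))
    ref_b = 1.0 / (1.0 + np.exp(-(grid - 0.6) / 0.02))
    target = ref_a + 0.01 * np.sin(40 * grid)
    print(rank_references(target, {"A": ref_a, "B": ref_b}))
```

Majority voting is used here only because it is the simplest way to combine learners; the sketch makes no claim about how ELSIE weights or selects its 33 weak learners.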
