Calibration models have been developed for the determination of trace elements, silver for instance, in soil using laser-induced breakdown spectroscopy (LIBS). The major concern is the matrix effect. Although it affects the accuracy of LIBS measurements in general, the effect appears accentuated for soil because of the large variation of chemical and physical properties among different soils. The purpose is to reduce its influence in such a way that an accurate and soil-independent calibration model can be constructed. At the same time, the developed model should efficiently reduce the experimental fluctuations affecting measurement precision. A univariate model first reveals the obvious influence of the matrix effect and significant experimental fluctuation. A multivariate model has then been developed. A key point is the introduction of a generalized spectrum in which variables representing the soil type are explicitly included. Machine learning has been used to develop the model. After a necessary pretreatment, in which a feature selection process reduces the dimension of the raw spectrum according to the number of available spectra, the data are fed into a back-propagation neural network (BPNN) to train and validate the model. The resulting soil-independent calibration model achieves an average relative error of calibration (REC) and an average relative error of prediction (REP) within the range of 5-6%.
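The generalized spectrum and the relative-error metrics described above can be illustrated with a minimal sketch. The one-hot soil encoding, the function names, and the exact error formula (mean of |predicted - reference| / reference) are assumptions made for illustration; the abstract does not specify how the soil-type variables or REC/REP are computed.

```python
def generalized_spectrum(intensities, soil_type, soil_types):
    """Append soil-type variables to the selected spectral intensities.

    Assumption: the soil type is one-hot encoded; the abstract only says
    that soil-type variables are explicitly included.
    """
    one_hot = [1.0 if s == soil_type else 0.0 for s in soil_types]
    return list(intensities) + one_hot


def mean_relative_error(predicted, reference):
    """Average relative error in percent (assumed form of REC/REP)."""
    errors = [abs(p - r) / r for p, r in zip(predicted, reference)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical example: two selected line intensities and three soil types.
features = generalized_spectrum([0.2, 0.5], "clay", ["sand", "clay", "loam"])
error_percent = mean_relative_error([105.0, 95.0], [100.0, 100.0])
```

A vector built this way could serve as the network input, so that one model sees both the spectral lines and the soil identity.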
The excited state dynamics of chromophores in complex environments determine a range of vital biological and energy capture processes. Time-resolved, multidimensional optical spectroscopies provide a key tool to investigate these processes. Although theory has the potential to decode these spectra in terms of the electronic and atomistic dynamics, the need for large numbers of excited state electronic structure calculations severely limits first principles predictions of multidimensional optical spectra for chromophores in the condensed phase. Here, we leverage the locality of chromophore excitations to develop machine learning models to predict the excited state energy gap of chromophores in complex environments for efficiently constructing linear and multidimensional optical spectra. By analyzing the performance of these models, which span a hierarchy of physical approximations, across a range of chromophore-environment interaction strengths, we provide strategies for the construction of ML models that greatly accelerate the calculation of multidimensional optical spectra from first principles.
Fast and inexpensive characterization of materials properties is a key element in discovering novel functional materials. In this work, we suggest an approach employing three classes of Bayesian machine learning (ML) models to correlate electronic absorption spectra of nanoaggregates with the strength of intermolecular electronic couplings in organic conducting and semiconducting materials. As a specific model system, we consider PEDOT:PSS, a cornerstone material for organic electronic applications, and analyze the couplings between charged dimers of closely packed PEDOT oligomers that are at the heart of the material's unrivaled conductivity. We demonstrate that ML algorithms can identify correlations between the coupling strengths and the electronic absorption spectra. We also show that ML models can be trained to be transferable across a broad range of spectral resolutions, and that the electronic couplings can be predicted from the simulated spectra with 88% accuracy when ML models are used as classifiers. Although the ML models employed in this study were trained on data generated by a multi-scale computational workflow, they were able to leverage experimental data.
The advancement of science as outlined by Popper and Kuhn is largely qualitative, but with bibliometric data it is possible and desirable to develop a quantitative picture of scientific progress. Furthermore, it is important to allocate finite resources to research topics that have growth potential, to accelerate the process from scientific breakthroughs to technological innovations. In this paper, we address this problem of quantitative knowledge evolution by analysing the APS publication data set from 1981 to 2010. We build the bibliographic coupling and co-citation networks, use the Louvain method to detect topical clusters (TCs) in each year, measure the similarity of TCs in consecutive years, and visualize the results as alluvial diagrams. Given the predictive features describing a TC and its known evolution in the next year, we can train a machine learning model to predict future changes of TCs, i.e., their continuing, dissolving, merging and splitting. We found the number of papers from certain journals, the degree, closeness, and betweenness to be the most predictive features. Additionally, betweenness increases significantly for merging events, and decreases significantly for splitting events. Our results represent a first step from a descriptive understanding of the Science of Science (SciSci) towards one that is ultimately prescriptive.
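The step of comparing topical clusters in consecutive years and labelling their evolution can be sketched as follows. This is an illustrative sketch only: the Jaccard index, the 0.3 threshold, the representation of each TC as a set of item identifiers, and the labelling rules are all assumptions, since the abstract does not state which similarity measure or criteria are used.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections (assumed similarity measure)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def successors(clusters_t, clusters_next, threshold=0.3):
    """Map each cluster in year t to the next-year clusters it resembles."""
    return {
        name: [nxt for nxt, members in clusters_next.items()
               if jaccard(items, members) >= threshold]
        for name, items in clusters_t.items()
    }


def label_events(links):
    """Label each year-t cluster as continuing, dissolving, merging or splitting."""
    predecessors = {}
    for name, nxts in links.items():
        for nxt in nxts:
            predecessors.setdefault(nxt, []).append(name)
    events = {}
    for name, nxts in links.items():
        if not nxts:
            events[name] = "dissolving"      # no similar successor
        elif len(nxts) > 1:
            events[name] = "splitting"       # several similar successors
        elif len(predecessors[nxts[0]]) > 1:
            events[name] = "merging"         # successor shared with another TC
        else:
            events[name] = "continuing"
    return events


# Hypothetical clusters, each a set of item identifiers.
year_t = {"A": {1, 2, 3, 4}, "B": {5, 6}, "C": {7, 8}, "D": {20, 21}}
year_next = {"X": {1, 2, 3, 4, 5, 6}, "Y": {7, 8, 9}}
events = label_events(successors(year_t, year_next))
```

Labels produced this way could act as the training targets, with network features such as degree, closeness, and betweenness as inputs.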
The HyChem approach has recently been proposed for modeling high-temperature combustion of real, multi-component fuels. The approach combines lumped reaction steps for fuel thermal and oxidative pyrolysis with detailed chemistry for the oxidation of the resulting pyrolysis products. However, the approach usually shows substantial discrepancies with experimental data within the Negative Temperature Coefficient (NTC) regime, as the low-temperature chemistry is more fuel-specific than high-temperature chemistry. This paper proposes a machine learning approach to learn HyChem models that can cover both high-temperature and low-temperature regimes. Specifically, we develop a HyChem model using experimental datasets of ignition delay times covering a wide range of temperatures and equivalence ratios. The chemical kinetic model is treated as a neural network model, and we then employ stochastic gradient descent (SGD), a technique that was developed for deep learning, for the training. We demonstrate the approach in learning the HyChem model for F-24, which is a Jet-A derived fuel, and compare the results with previous work employing genetic algorithms. The results show that the SGD approach can achieve model performance comparable to that of genetic algorithms while reducing the computational cost by a factor of 1000. In addition, with regularization, the SGD approach changes the kinetic parameters from their original values much less than the genetic algorithm does and is thus more likely to retain mechanistic meaning. Finally, our approach is built upon open-source packages and can be applied to the development and optimization of chemical kinetic models for internal combustion engine simulations.
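The idea of training kinetic parameters with SGD while a regularization term keeps them close to their original values can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the two-parameter surrogate ln(tau) = a + b * (1000 / T) stands in for a real kinetic model, the data are synthetic, and the penalty lam * ||theta - theta0||^2 is one plausible form of regularization; none of this is the HyChem model or the authors' implementation.

```python
import random


def mse(theta, data):
    """Mean squared error of the toy surrogate ln(tau) = a + b * x."""
    a, b = theta
    return sum((a + b * x - y) ** 2 for x, y in data) / len(data)


def sgd_fit(data, theta0, lr=0.05, lam=0.01, epochs=200, seed=0):
    """Per-sample SGD with an L2 penalty pulling theta back toward theta0."""
    rng = random.Random(seed)
    a, b = theta0
    samples = list(data)
    for _ in range(epochs):
        rng.shuffle(samples)
        for x, y in samples:
            err = (a + b * x) - y
            # Gradient of err**2 + lam * ((a - a0)**2 + (b - b0)**2)
            grad_a = 2 * err + 2 * lam * (a - theta0[0])
            grad_b = 2 * err * x + 2 * lam * (b - theta0[1])
            a -= lr * grad_a
            b -= lr * grad_b
    return (a, b)


# Synthetic "ignition delay" data generated from hypothetical true parameters.
temperatures = [800.0, 1000.0, 1200.0, 1400.0]
data = [(1000.0 / T, -10.0 + 12.0 * (1000.0 / T)) for T in temperatures]
theta0 = (-9.0, 11.0)            # hypothetical nominal parameters
theta = sgd_fit(data, theta0)
```

Raising `lam` in this sketch trades fit quality for smaller departures from the nominal parameters, which mirrors the trade-off the abstract describes.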
Interpretability of learning-to-rank models is a crucial yet relatively under-examined research area. Recent progress on interpretable ranking models largely focuses on generating post-hoc explanations for existing black-box ranking models, whereas the alternative option of building an intrinsically interpretable ranking model with a transparent and self-explainable structure remains unexplored. Developing fully understandable ranking models is necessary in some scenarios (e.g., due to legal or policy constraints) where post-hoc methods cannot provide sufficiently accurate explanations. In this paper, we lay the groundwork for intrinsically interpretable learning-to-rank by introducing generalized additive models (GAMs) into ranking tasks. GAMs are intrinsically interpretable machine learning models and have been extensively studied on regression and classification tasks. We study how to extend GAMs into ranking models that can handle both item-level and list-level features, and we propose a novel formulation of ranking GAMs. To instantiate ranking GAMs, we employ neural networks instead of traditional splines or regression trees. We also show that our neural ranking GAMs can be distilled into a set of simple and compact piece-wise linear functions that are much more efficient to evaluate with little accuracy loss. We conduct experiments on three data sets and show that our proposed neural ranking GAMs can achieve significantly better performance than other traditional GAM baselines while maintaining similar interpretability.
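The additive structure of a ranking GAM, in which an item's score is a sum of one-dimensional functions of its individual features, can be sketched with piece-wise linear sub-models of the kind the distillation step produces. The feature names, knot values, and clamping behaviour outside the knot range are hypothetical, and the sketch covers item-level features only; it is not the paper's formulation for list-level features.

```python
from bisect import bisect_right


def piecewise_linear(knots, x):
    """Evaluate a piece-wise linear function from sorted (x, y) knots.

    Assumes distinct, increasing knot positions; values outside the knot
    range are clamped to the nearest end point.
    """
    xs = [k[0] for k in knots]
    if x <= xs[0]:
        return knots[0][1]
    if x >= xs[-1]:
        return knots[-1][1]
    i = bisect_right(xs, x) - 1
    (x0, y0), (x1, y1) = knots[i], knots[i + 1]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def gam_score(sub_models, features):
    """Additive score: one sub-model per feature, summed."""
    return sum(piecewise_linear(sub_models[name], value)
               for name, value in features.items())


def rank(sub_models, items):
    """Return item ids sorted by descending GAM score."""
    return sorted(items, key=lambda i: gam_score(sub_models, items[i]),
                  reverse=True)


# Hypothetical sub-models and items.
sub_models = {
    "bm25": [(0.0, 0.0), (10.0, 2.0)],
    "age_days": [(0.0, 1.0), (100.0, 0.0)],
}
items = {
    "a": {"bm25": 5.0, "age_days": 50.0},
    "b": {"bm25": 10.0, "age_days": 100.0},
    "c": {"bm25": 0.0, "age_days": 0.0},
}
ordering = rank(sub_models, items)
```

Because each feature contributes through its own curve, the contribution of any single feature to an item's score can be read directly from the corresponding knots.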