ترغب بنشر مسار تعليمي؟ اضغط هنا

Machine Learning in Thermodynamics: Prediction of Activity Coefficients by Matrix Completion

94   0   0.0 ( 0 )
 نشر من قبل Robert Bamler
 تاريخ النشر 2020
والبحث باللغة English




اسأل ChatGPT حول البحث

Activity coefficients, which are a measure of the non-ideality of liquid mixtures, are a key property in chemical engineering with relevance to modeling chemical and phase equilibria as well as transport processes. Although experimental data on thousands of binary mixtures are available, prediction methods are needed to calculate the activity coefficients in many relevant mixtures that have not been explored to-date. In this report, we propose a probabilistic matrix factorization model for predicting the activity coefficients in arbitrary binary mixtures. Although no physical descriptors for the considered components were used, our method outperforms the state-of-the-art method that has been refined over three decades while requiring much less training effort. This opens perspectives to novel methods for predicting physico-chemical properties of binary mixtures with the potential to revolutionize modeling and simulation in chemical engineering.



قيم البحث

اقرأ أيضاً

124 - Antoine Ledent , Rodrigo Alves , 2020
We propose orthogonal inductive matrix completion (OMIC), an interpretable approach to matrix completion based on a sum of multiple orthonormal side information terms, together with nuclear-norm regularization. The approach allows us to inject prio r knowledge about the singular vectors of the ground truth matrix. We optimize the approach by a provably converging algorithm, which optimizes all components of the model simultaneously. We study the generalization capabilities of our method in both the distribution-free setting and in the case where the sampling distribution admits uniform marginals, yielding learning guarantees that improve with the quality of the injected knowledge in both cases. As particular cases of our framework, we present models which can incorporate user and item biases or community information in a joint and additive fashion. We analyse the performance of OMIC on several synthetic and real datasets. On synthetic datasets with a sliding scale of user bias relevance, we show that OMIC better adapts to different regimes than other methods. On real-life datasets containing user/items recommendations and relevant side information, we find that OMIC surpasses the state-of-the-art, with the added benefit of greater interpretability.
A Global Navigation Satellite System (GNSS) uses a constellation of satellites around the earth for accurate navigation, timing, and positioning. Natural phenomena like space weather introduce irregularities in the Earths ionosphere, disrupting the p ropagation of the radio signals that GNSS relies upon. Such disruptions affect both the amplitude and the phase of the propagated waves. No physics-based model currently exists to predict the time and location of these disruptions with sufficient accuracy and at relevant scales. In this paper, we focus on predicting the phase fluctuations of GNSS radio waves, known as phase scintillations. We propose a novel architecture and loss function to predict 1 hour in advance the magnitude of phase scintillations within a time window of plus-minus 5 minutes with state-of-the-art performance.
We give an online algorithm and prove novel mistake and regret bounds for online binary matrix completion with side information. The mistake bounds we prove are of the form $tilde{O}(D/gamma^2)$. The term $1/gamma^2$ is analogous to the usual margin term in SVM (perceptron) bounds. More specifically, if we assume that there is some factorization of the underlying $m times n$ matrix into $P Q^intercal$ where the rows of $P$ are interpreted as classifiers in $mathcal{R}^d$ and the rows of $Q$ as instances in $mathcal{R}^d$, then $gamma$ is the maximum (normalized) margin over all factorizations $P Q^intercal$ consistent with the observed matrix. The quasi-dimension term $D$ measures the quality of side information. In the presence of vacuous side information, $D= m+n$. However, if the side information is predictive of the underlying factorization of the matrix, then in an ideal case, $D in O(k + ell)$ where $k$ is the number of distinct row factors and $ell$ is the number of distinct column factors. We additionally provide a generalization of our algorithm to the inductive setting. In this setting, we provide an example where the side information is not directly specified in advance. For this example, the quasi-dimension $D$ is now bounded by $O(k^2 + ell^2)$.
Due to the insufficient measurements in the distribution system state estimation (DSSE), full observability and redundant measurements are difficult to achieve without using the pseudo measurements. The matrix completion state estimation (MCSE) combi nes the matrix completion and power system model to estimate voltage by exploring the low-rank characteristics of the matrix. This paper proposes a robust matrix completion state estimation (RMCSE) to estimate the voltage in a distribution system under a low-observability condition. Tradition state estimation weighted least squares (WLS) method requires full observability to calculate the states and needs redundant measurements to proceed a bad data detection. The proposed method improves the robustness of the MCSE to bad data by minimizing the rank of the matrix and measurements residual with different weights. It can estimate the system state in a low-observability system and has robust estimates without the bad data detection process in the face of multiple bad data. The method is numerically evaluated on the IEEE 33-node radial distribution system. The estimation performance and robustness of RMCSE are compared with the WLS with the largest normalized residual bad data identification (WLS-LNR), and the MCSE.
Smooth power generation from solar stations demand accurate, reliable and efficient forecast of solar energy for optimal integration to cater market demand; however, the implicit instability of solar energy production may cause serious problems for t he smooth power generation. We report daily prediction of solar energy by exploiting the strength of machine learning techniques to capture and analyze complicated behavior of enormous features effectively. For this purpose, dataset comprising of 98 solar stations has been taken from energy competition of American Meteorological Society (AMS) for predicting daily solar energy. Forecast models of base line regressors including linear, ridge, lasso, decision tree, random forest and artificial neural networks have been implemented on the AMS solar dataset. Grid size is converted into two sections: 16x9 and 10x4 to ascertain attributes contributing more towards the generated power from densely located stations on global ensemble forecast system (GEFS). To evaluate the models, statistical measures of prediction error in terms of RMSE, MAE and R2_score have been analyzed and compared with the existing techniques. It has been observed that improved accuracy is achieved through random forest and ridge regressor for both grid sizes in contrast to all other proposed methods. Stability and reliability of the proposed schemes are evaluated on a single solar station as well as on multiple independent runs.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا