Learning to discover: expressive Gaussian mixture models for multi-dimensional simulation and parameter inference in the physical sciences

79 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Darren Price

تاريخ النشر 2021

مجال البحث فيزياء الهندسة المعلوماتية

والبحث باللغة English

تأليف Stephen B. Menary - Darren D. Price

تحليل البيانات والإحصاءات والاحتمال التعلم الآلي فيزياء الطاقة العالية - التجربة

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

We show that density models describing multiple observables with (i) hard boundaries and (ii) dependence on external parameters may be created using an auto-regressive Gaussian mixture model. The model is designed to capture how observable spectra are deformed by hypothesis variations, and is made more expressive by projecting data onto a configurable latent space. It may be used as a statistical model for scientific discovery in interpreting experimental observations, for example when constraining the parameters of a physical model or tuning simulation parameters according to calibration data. The model may also be sampled for use within a Monte Carlo simulation chain, or used to estimate likelihood ratios for event classification. The method is demonstrated on simulated high-energy particle physics data considering the anomalous electroweak production of a $Z$ boson in association with a dijet system at the Large Hadron Collider, and the accuracy of inference is tested using a realistic toy example. The developed methods are domain agnostic; they may be used within any field to perform simulation or inference where a dataset consisting of many real-valued observables has conditional dependence on external parameters.

قيم البحث

218 - Soheil Kolouri , Gustavo K. Rohde , Heiko Hoffmann 2017

Gaussian mixture models (GMM) are powerful parametric tools with many applications in machine learning and computer vision. Expectation maximization (EM) is the most popular algorithm for estimating the GMM parameters. However, EM guarantees only con vergence to a stationary point of the log-likelihood function, which could be arbitrarily worse than the optimal solution. Inspired by the relationship between the negative log-likelihood function and the Kullback-Leibler (KL) divergence, we propose an alternative formulation for estimating the GMM parameters using the sliced Wasserstein distance, which gives rise to a new algorithm. Specifically, we propose minimizing the sliced-Wasserstein distance between the mixture model and the data distribution with respect to the GMM parameters. In contrast to the KL-divergence, the energy landscape for the sliced-Wasserstein distance is more well-behaved and therefore more suitable for a stochastic gradient descent scheme to obtain the optimal GMM parameters. We show that our formulation results in parameter estimates that are more robust to random initializations and demonstrate that it can estimate high-dimensional data distributions more faithfully than the EM algorithm.

الرؤية الحاسوبية وتمييز الأنماط التعلم الآلي التعلم الالي

Deep learning for inferring cause of data anomalies

60 - V. Azzolini , M. Borisyak , G. Cerminara 2017

Daily operation of a large-scale experiment is a resource consuming task, particularly from perspectives of routine data quality monitoring. Typically, data comes from different sub-detectors and the global quality of data depends on the combinatoria l performance of each of them. In this paper, the problem of identifying channels in which anomalies occurred is considered. We introduce a generic deep learning model and prove that, under reasonable assumptions, the model learns to identify channels which are affected by an anomaly. Such model could be used for data quality manager cross-check and assistance and identifying good channels in anomalous data samples. The main novelty of the method is that the model does not require ground truth labels for each channel, only global flag is used. This effectively distinguishes the model from classical classification methods. Being applied to CMS data collected in the year 2010, this approach proves its ability to decompose anomaly by separate channels.

تحليل البيانات والإحصاءات والاحتمال التعلم الآلي فيزياء الطاقة العالية - التجربة

Representation Learning in Sequence to Sequence Tasks: Multi-filter Gaussian Mixture Autoencoder

92 - Yunhao Yang , Zhaokun Xue 2021

Heterogeneity of sentences exists in sequence to sequence tasks such as machine translation. Sentences with largely varied meanings or grammatical structures may increase the difficulty of convergence while training the network. In this paper, we int roduce a model to resolve the heterogeneity in the sequence to sequence task. The Multi-filter Gaussian Mixture Autoencoder (MGMAE) utilizes an autoencoder to learn the representations of the inputs. The representations are the outputs from the encoder, lying in the latent space whose dimension is the hidden dimension of the encoder. The representations of training data in the latent space are used to train Gaussian mixtures. The latent space representations are divided into several mixtures of Gaussian distributions. A filter (decoder) is tuned to fit the data in one of the Gaussian distributions specifically. Each Gaussian is corresponding to one filter so that the filter is responsible for the heterogeneity within this Gaussian. Thus the heterogeneity of the training data can be resolved. Comparative experiments are conducted on the Geo-query dataset and English-French translation. Our experiments show that compares to the traditional encoder-decoder model, this network achieves better performance on sequence to sequence tasks such as machine translation and question answering.

الحساب واللغة التعلم الآلي

A machine learning framework for computationally expensive transient models

84 - Prashant Kumar , Kushal Sinha , Nandkishor Nere 2019

The promise of machine learning has been explored in a variety of scientific disciplines in the last few years, however, its application on first-principles based computationally expensive tools is still in nascent stage. Even with the advances in co mputational resources and power, transient simulations of large-scale dynamic systems using a variety of the first-principles based computational tools are still limited. In this work, we propose an ensemble approach where we combine one such computationally expensive tool, called discrete element method (DEM), with a time-series forecasting method called auto-regressive integrated moving average (ARIMA) and machine-learning methods to significantly reduce the computational burden while retaining model accuracy and performance. The developed machine-learning model shows good predictability and agreement with the literature, demonstrating its tremendous potential in scientific computing.

تحليل البيانات والإحصاءات والاحتمال التعلم الآلي الفيزياء التطبيقية

Interpretability of machine-learning models in physical sciences

87 - Luca M. Ghiringhelli 2021

In machine learning (ML), it is in general challenging to provide a detailed explanation on how a trained model arrives at its prediction. Thus, usually we are left with a black-box, which from a scientific standpoint is not satisfactory. Even though numerous methods have been recently proposed to interpret ML models, somewhat surprisingly, interpretability in ML is far from being a consensual concept, with diverse and sometimes contrasting motivations for it. Reasonable candidate properties of interpretable models could be model transparency (i.e. how does the model work?) and post hoc explanations (i.e., what else can the model tell me?). Here, I review the current debate on ML interpretability and identify key challenges that are specific to ML applied to materials science.

علم المواد تحليل البيانات والإحصاءات والاحتمال