Do you want to publish a course? Click here

Exponentially-Modified Gaussian Mixture Model: Applications in Spectroscopy

152   0   0.0 ( 0 )
 Added by Sebastian Ament
 Publication date 2019
and research's language is English




Ask ChatGPT about the research

We propose a novel exponentially-modified Gaussian (EMG) mixture residual model. The EMG mixture is well suited to model residuals that are contaminated by a distribution with positive support. This is in contrast to commonly used robust residual models, like the Huber loss or $ell_1$, which assume a symmetric contaminating distribution and are otherwise asymptotically biased. We propose an expectation-maximization algorithm to optimize an arbitrary model with respect to the EMG mixture. We apply the approach to linear regression and probabilistic matrix factorization (PMF). We compare against other residual models, including quantile regression. Our numerical experiments demonstrate the strengths of the EMG mixture on both tasks. The PMF model arises from considering spectroscopic data. In particular, we demonstrate the effectiveness of PMF in conjunction with the EMG mixture model on synthetic data and two real-world applications: X-ray diffraction and Raman spectroscopy. We show how our approach is effective in inferring background signals and systematic errors in data arising from these experimental settings, dramatically outperforming existing approaches and revealing the datas physically meaningful components.

rate research

Read More

Generalized Chinese Remainder Theorem (CRT) has been shown to be a powerful approach to solve the ambiguity resolution problem. However, with its close relationship to number theory, study in this area is mainly from a coding theory perspective under deterministic conditions. Nevertheless, it can be proved that even with the best deterministic condition known, the probability of success in robust reconstruction degrades exponentially as the number of estimand increases. In this paper, we present the first rigorous analysis on the underlying statistical model of CRT-based multiple parameter estimation, where a generalized Gaussian mixture with background knowledge on samplings is proposed. To address the problem, two novel approaches are introduced. One is to directly calculate the conditional maximal a posteriori probability (MAP) estimation of residue clustering, and the other is to iteratively search for MAP of both common residues and clustering. Moreover, remainder error-correcting codes are introduced to improve the robustness further. It is shown that this statistically based scheme achieves much stronger robustness compared to state-of-the-art deterministic schemes, especially in low and median Signal Noise Ratio (SNR) scenarios.
We consider the problem of clustering datasets in the presence of arbitrary outliers. Traditional clustering algorithms such as k-means and spectral clustering are known to perform poorly for datasets contaminated with even a small number of outliers. In this paper, we develop a provably robust spectral clustering algorithm that applies a simple rounding scheme to denoise a Gaussian kernel matrix built from the data points and uses vanilla spectral clustering to recover the cluster labels of data points. We analyze the performance of our algorithm under the assumption that the good data points are generated from a mixture of sub-gaussians (we term these inliers), while the outlier points can come from any arbitrary probability distribution. For this general class of models, we show that the misclassification error decays at an exponential rate in the signal-to-noise ratio, provided the number of outliers is a small fraction of the inlier points. Surprisingly, this derived error bound matches with the best-known bound for semidefinite programs (SDPs) under the same setting without outliers. We conduct extensive experiments on a variety of simulated and real-world datasets to demonstrate that our algorithm is less sensitive to outliers compared to other state-of-the-art algorithms proposed in the literature.
339 - Emilie Kaufmann 2018
This paper presents new deviation inequalities that are valid uniformly in time under adaptive sampling in a multi-armed bandit model. The deviations are measured using the Kullback-Leibler divergence in a given one-dimensional exponential family, and may take into account several arms at a time. They are obtained by constructing for each arm a mixture martingale based on a hierarchical prior, and by multiplying those martingales. Our deviation inequalities allow us to analyze stopping rules based on generalized likelihood ratios for a large class of sequential identification problems, and to construct tight confidence intervals for some functions of the means of the arms.
We consider estimating the parameters of a Gaussian mixture density with a given number of components best representing a given set of weighted samples. We adopt a density interpretation of the samples by viewing them as a discrete Dirac mixture density over a continuous domain with weighted components. Hence, Gaussian mixture fitting is viewed as density re-approximation. In order to speed up computation, an expectation-maximization method is proposed that properly considers not only the sample locations, but also the corresponding weights. It is shown that methods from literature do not treat the weights correctly, resulting in wrong estimates. This is demonstrated with simple counterexamples. The proposed method works in any number of dimensions with the same computational load as standard Gaussian mixture estimators for unweighted samples.
72 - Zhongyang Li 2020
We study the community detection problem on a Gaussian mixture model, in which vertices are divided into $kgeq 2$ distinct communities. The major difference in our model is that the intensities for Gaussian perturbations are different for different entries in the observation matrix, and we do not assume that every community has the same number of vertices. We explicitly find the threshold for the exact recovery of the maximum likelihood estimation. Applications include the community detection on hypergraphs.

suggested questions

comments
Fetching comments Fetching comments
Sign in to be able to follow your search criteria
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا