
Gaussian Mixture Estimation from Weighted Samples

Added by Daniel Frisch
Publication date: 2021
Language: English





We consider estimating the parameters of a Gaussian mixture density with a given number of components that best represents a given set of weighted samples. We adopt a density interpretation of the samples by viewing them as a discrete Dirac mixture density over a continuous domain with weighted components. Hence, Gaussian mixture fitting is viewed as density re-approximation. In order to speed up computation, an expectation-maximization method is proposed that properly considers not only the sample locations but also the corresponding weights. It is shown that methods from the literature do not treat the weights correctly, resulting in wrong estimates; this is demonstrated with simple counterexamples. The proposed method works in any number of dimensions with the same computational load as standard Gaussian mixture estimators for unweighted samples.
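The distinguishing step is that each sample's weight multiplies its responsibilities in every M-step statistic. Below is a minimal NumPy sketch of such a weighted EM loop for illustration; the function names, the initialization scheme, and the regularization constants are assumptions of this sketch, not the authors' exact algorithm.

```python
import numpy as np

def mvn_logpdf(x, mean, cov):
    """Log-density of N(mean, cov) evaluated at each row of x."""
    d = mean.shape[0]
    L = np.linalg.cholesky(cov)
    z = np.linalg.solve(L, (x - mean).T)             # whitened residuals, (d, m)
    return (-0.5 * (d * np.log(2 * np.pi) + (z ** 2).sum(axis=0))
            - np.log(np.diag(L)).sum())

def fit_weighted_gmm(x, w, n_components, n_iter=100, seed=0):
    """EM for a Gaussian mixture over weighted samples x (m, d), weights w (m,)."""
    rng = np.random.default_rng(seed)
    m, d = x.shape
    w = w / w.sum()                                  # normalize sample weights
    mu = x[rng.choice(m, n_components, replace=False)]
    cov = np.stack([np.cov(x.T) + 1e-6 * np.eye(d)] * n_components)
    pi = np.full(n_components, 1.0 / n_components)

    for _ in range(n_iter):
        # E-step: responsibilities r[i, j] = P(component j | sample i).
        log_r = np.stack([np.log(pi[j]) + mvn_logpdf(x, mu[j], cov[j])
                          for j in range(n_components)], axis=1)
        log_r -= log_r.max(axis=1, keepdims=True)    # numerical stabilization
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)

        # M-step: every sufficient statistic is accumulated with w[i] * r[i, j],
        # so sample weights enter on equal footing with the responsibilities.
        wr = w[:, None] * r
        nk = wr.sum(axis=0)                          # effective component masses
        pi = nk                                      # w sums to 1, so nk does too
        mu = (wr.T @ x) / nk[:, None]
        for j in range(n_components):
            diff = x - mu[j]
            cov[j] = (wr[:, j, None] * diff).T @ diff / nk[j] + 1e-9 * np.eye(d)
    return pi, mu, cov
```

Note that setting all weights equal recovers standard EM, so the weighted variant costs the same per iteration as the unweighted one, consistent with the claim above.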



Related research

Wei Cui, Xu Zhang (2018)
Covariance matrix estimation concerns the problem of estimating the covariance matrix from a collection of samples, which is of great importance in many applications. Classical results have shown that $O(n)$ samples are sufficient to accurately estimate the covariance matrix from $n$-dimensional independent Gaussian samples. However, in many practical applications, the received signal samples might be correlated, which makes the classical analysis inapplicable. In this paper, we develop a non-asymptotic analysis for covariance matrix estimation from correlated Gaussian samples. Our theoretical results show that the error bounds are determined by the signal dimension $n$, the sample size $m$, and the shape parameter of the distribution of the correlated sample covariance matrix. In particular, when the shape parameter belongs to a class of Toeplitz matrices (a case of great practical interest), $O(n)$ samples are also sufficient to faithfully estimate the covariance matrix from correlated samples. Simulations are provided to verify the correctness of the theoretical results.
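As a rough illustration of the setting described above (not of the paper's analysis), the following sketch simulates Gaussian samples that are correlated across the sample index through a Toeplitz shape matrix and checks how well the plain sample covariance recovers the true covariance; all matrices, sizes, and parameter values are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 20, 2000                      # signal dimension and sample size

# True covariance to recover, and a Toeplitz "shape" matrix modelling
# correlation across the sample index (both are illustrative choices).
sigma = np.fromfunction(lambda i, j: 0.5 ** np.abs(i - j), (n, n))
T = np.fromfunction(lambda i, j: 0.3 ** np.abs(i - j), (m, m))

# Correlated Gaussian samples: the columns of Y are no longer independent.
Z = rng.standard_normal((n, m))
Y = np.linalg.cholesky(sigma) @ Z @ np.linalg.cholesky(T).T

sigma_hat = (Y @ Y.T) / m            # plain sample covariance estimator
err = np.linalg.norm(sigma_hat - sigma, 2) / np.linalg.norm(sigma, 2)
print(f"relative spectral-norm error: {err:.3f}")
```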
We propose a novel exponentially-modified Gaussian (EMG) mixture residual model. The EMG mixture is well suited to model residuals that are contaminated by a distribution with positive support. This is in contrast to commonly used robust residual models, like the Huber loss or $\ell_1$, which assume a symmetric contaminating distribution and are otherwise asymptotically biased. We propose an expectation-maximization algorithm to optimize an arbitrary model with respect to the EMG mixture. We apply the approach to linear regression and probabilistic matrix factorization (PMF). We compare against other residual models, including quantile regression. Our numerical experiments demonstrate the strengths of the EMG mixture on both tasks. The PMF model arises from considering spectroscopic data. In particular, we demonstrate the effectiveness of PMF in conjunction with the EMG mixture model on synthetic data and two real-world applications: X-ray diffraction and Raman spectroscopy. We show how our approach is effective in inferring background signals and systematic errors in data arising from these experimental settings, dramatically outperforming existing approaches and revealing the data's physically meaningful components.
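For intuition about the residual model itself, here is a hedged sketch that evaluates the log-density of a two-component mixture of a zero-mean Gaussian and an exponentially-modified Gaussian via scipy.stats.exponnorm; the mixture weight and shape values are illustrative, and this is not the paper's EM fitting procedure.

```python
import numpy as np
from scipy.stats import norm, exponnorm

def emg_mixture_logpdf(r, p, sigma, k, scale):
    """Residual model: (1 - p) * N(0, sigma^2) + p * EMG(k, scale).
    The EMG component is positively skewed, capturing contamination
    from a distribution with positive support."""
    return np.logaddexp(np.log1p(-p) + norm.logpdf(r, scale=sigma),
                        np.log(p) + exponnorm.logpdf(r, k, scale=scale))

# Mostly-Gaussian residuals contaminated by a positive-support component.
rng = np.random.default_rng(1)
r = np.concatenate([rng.normal(0.0, 1.0, 900),
                    rng.normal(0.0, 1.0, 100) + rng.exponential(5.0, 100)])
nll = -emg_mixture_logpdf(r, p=0.1, sigma=1.0, k=5.0, scale=1.0).sum()
print(f"negative log-likelihood at trial parameters: {nll:.1f}")
```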
How can we train a statistical mixture model on a massive data set? In this work we show how to construct coresets for mixtures of Gaussians. A coreset is a weighted subset of the data which guarantees that models fitting the coreset also provide a good fit for the original data set. We show that, perhaps surprisingly, Gaussian mixtures admit coresets of size polynomial in the dimension and the number of mixture components, while being independent of the data set size. Hence, one can harness computationally intensive algorithms to compute a good approximation on a significantly smaller data set. More importantly, such coresets can be efficiently constructed both in distributed and streaming settings and do not impose restrictions on the data generating process. Our results rely on a novel reduction of statistical estimation to problems in computational geometry and on new combinatorial complexity results for mixtures of Gaussians. Empirical evaluation on several real-world datasets suggests that our coreset-based approach enables a significant reduction in training time with negligible approximation error.
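A coreset exposes exactly the interface the weighted-EM sketch above consumes: a set of points with weights. The toy below uses uniform subsampling with compensating weights purely to illustrate that interface; the paper's actual construction is sensitivity-based importance sampling with the polynomial-size guarantees described above.

```python
import numpy as np

def uniform_coreset(x, size, seed=0):
    """Toy stand-in for a GMM coreset: a uniform subsample whose weights
    make it represent the full data set in expectation. Real coresets
    choose points by (approximate) sensitivity, not uniformly."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(x), size, replace=False)
    w = np.full(size, len(x) / size)   # each kept point stands in for n/size points
    return x[idx], w

# A weighted GMM fit (e.g. the fit_weighted_gmm sketch above) then runs on
# (subset, weights) instead of the full data set:
# subset, weights = uniform_coreset(x, size=1_000)
```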
Nisar R. Ahmed (2019)
This work examines the problem of using finite Gaussian mixture (GM) probability density functions in recursive Bayesian peer-to-peer decentralized data fusion (DDF). It is shown that algorithms for both exact and approximate GM DDF lead to the same problem of finding a suitable GM approximation to a posterior fusion pdf resulting from the division of a 'naive Bayes' fusion GM (representing the direct combination of possibly dependent information sources) by another non-Gaussian pdf (representing the removal of either the actual or estimated 'common information' between the information sources). The resulting quotient pdf for general GM fusion is naturally a mixture pdf, although the fused mixands are non-Gaussian and are not analytically tractable for recursive Bayesian updates. Parallelizable importance sampling algorithms for both direct local approximation and indirect global approximation of the quotient mixture are developed to find tractable GM approximations to the non-Gaussian 'sum of quotients' mixtures. Practical application examples for multi-platform static target search and maneuverable range-based target tracking demonstrate the higher fidelity of the resulting approximations compared to existing GM DDF techniques, as well as their favorable computational features.
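The computational core is approximating a quotient of densities. A hedged 1D sketch of the importance-sampling idea: draw from the naive-fusion density, weight each draw by the reciprocal of the common-information density, and treat the result as weighted samples from the quotient (which a weighted GM fit, as sketched earlier, could then approximate). The densities and all parameters here are illustrative stand-ins, not the paper's fusion pdfs.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)

def g_pdf(x):
    """Illustrative non-Gaussian 'common information' density (a GM here)."""
    return 0.5 * norm.pdf(x, -1.0, 2.0) + 0.5 * norm.pdf(x, 1.0, 2.0)

# Importance sampling: draw from the naive-fusion proposal f = N(0, 1) and
# weight by 1/g, so the weighted samples target the (unnormalized) quotient f/g.
x = rng.normal(0.0, 1.0, 10_000)
w = 1.0 / g_pdf(x)
w /= w.sum()

# Weighted moment estimates of the quotient density; a tractable GM
# approximation would instead fit these weighted samples with weighted EM.
mean = np.sum(w * x)
var = np.sum(w * (x - mean) ** 2)
print(f"quotient density: mean ~ {mean:.3f}, var ~ {var:.3f}")
```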
The problem of multimodal clustering arises whenever the data are gathered with several physically different sensors. Observations from different modalities are not necessarily aligned, in the sense that there is no obvious way to associate them or to compare them in some common space. A solution may consist of considering multiple clustering tasks independently for each modality. The main difficulty with such an approach is to guarantee that the unimodal clusterings are mutually consistent. In this paper we show that multimodal clustering can be addressed within a novel framework, namely conjugate mixture models. These models exploit the explicit transformations that are often available between an unobserved parameter space (objects) and each one of the observation spaces (sensors). We formulate the problem as a likelihood maximization task and derive the associated conjugate expectation-maximization algorithm. The convergence properties of the proposed algorithm are thoroughly investigated. Several local/global optimization techniques are proposed in order to increase its convergence speed. Two initialization strategies are proposed and compared. A consistent model-selection criterion is proposed. The algorithm and its variants are tested and evaluated within the task of 3D localization of several speakers using both auditory and visual data.
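A hedged 1D sketch of the central mechanism: each modality's E-step compares its own observations to the shared latent centers pushed through that modality's known map, and the M-step updates the shared centers using both modalities at once, which is what keeps the unimodal clusterings mutually consistent. The affine maps, noise levels, and closed-form update below are assumptions of this sketch, not the paper's notation.

```python
import numpy as np

# Each modality observes the same latent 1D object positions `mus` through a
# known affine map y = a*mu + b with noise std s (illustrative values).
maps = [(1.0, 0.0, 1.0),   # modality 1 (e.g. auditory)
        (2.0, 1.0, 0.5)]   # modality 2 (e.g. visual)

def e_step(y, a, b, s, mus):
    """Responsibilities of one modality's observations w.r.t. shared centers."""
    ll = -0.5 * ((y[:, None] - (a * mus + b)) / s) ** 2
    ll -= ll.max(axis=1, keepdims=True)
    r = np.exp(ll)
    return r / r.sum(axis=1, keepdims=True)

def m_step(ys, rs, mus):
    """Joint closed-form update of the shared centers: both modalities pull
    on the same mus, enforcing mutually consistent unimodal clusterings."""
    num, den = np.zeros_like(mus), np.zeros_like(mus)
    for y, r, (a, b, s) in zip(ys, rs, maps):
        num += (a / s**2) * (r * (y[:, None] - b)).sum(axis=0)
        den += (a**2 / s**2) * r.sum(axis=0)
    return num / den
```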