Parameter Estimation for Grouped Data Using EM and MCEM Algorithms

93 0 0.0 ( 0 )

Download Cite

Added by Camila P. E. De Souza Dr.

Publication date 2021

fields Mathematical Statistics

and research's language is English

Authors Zahra A. Shirazi - Jo~ao Pedro A. R. da Silva - Camila P. E. den Souza

Methodology

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

Nowadays, the confidentiality of data and information is of great importance for many companies and organizations. For this reason, they may prefer not to release exact data, but instead to grant researchers access to approximate data. For example, rather than providing the exact income of their clients, they may only provide researchers with grouped data, that is, the number of clients falling in each of a set of non-overlapping income intervals. The challenge is to estimate the mean and variance structure of the hidden ungrouped data based on the observed grouped data. To tackle this problem, this work considers the exact observed data likelihood and applies the Expectation-Maximization (EM) and Monte-Carlo EM (MCEM) algorithms for cases where the hidden data follow a univariate, bivariate, or multivariate normal distribution. The results are then compared with the case of ignoring the grouping and applying regular maximum likelihood. The well-known Galton data and simulated datasets are used to evaluate the properties of the proposed EM and MCEM algorithms.

rate research

MCEM and SAEM Algorithms for Geostatistical Models under Preferential Sampling

53 - Douglas Mateus da Silva , Lourdes C. Contreras Montenegro 2021

The problem of preferential sampling in geostatistics arises when the choise of location to be sampled is made with information about the phenomena in the study. The geostatistical model under preferential sampling deals with this problem, but parameter estimation is challenging because the likelihood function has no closed form. We developed an MCEM and an SAEM algorithm for finding the maximum likelihood estimators of parameters of the model and compared our methodology with the existing ones: Monte Carlo likelihood approximation and Laplace approximation. Simulated studies were realized to assess the quality of the proposed methods and showed good parameter estimation and prediction in preferential sampling. Finally, we illustrate our findings on the well known moss data from Galicia.

Methodology

Active Set and EM Algorithms for Log-Concave Densities Based on Complete and Censored Data

512 - Lutz Duembgen , Andre Huesler , Kaspar Rufibach 2011

Methodology Computation

Online EM for Functional Data

390 - Florian Maire , Eric Moulines , Sidonie Lefebvre 2016

A novel approach to perform unsupervised sequential learning for functional data is proposed. Our goal is to extract reference shapes (referred to as templates) from noisy, deformed and censored realizations of curves and images. Our model generalizes the Bayesian dense deformable template model (Allassonni`ere et al., 2007), a hierarchical model in which the template is the function to be estimated and the deformation is a nuisance, assumed to be random with a known prior distribution. The templates are estimated using a Monte Carlo version of the online Expectation-Maximization algorithm, extending the work from Cappe and Moulines (2009). Our sequential inference framework is significantly more computationally efficient than equivalent batch learning algorithms, especially when the missing data is high-dimensional. Some numerical illustrations on curve registration problem and templates extraction from images are provided to support our findings.

Methodology

Data Reduction in Markov model using EM algorithm

166 - Atanu Kumar Ghosh , Arnab Chakraborty 2018

This paper describes a data reduction technique in case of a markov chain of specified order. Instead of observing all the transitions in a markov chain we record only a few of them and treat the remaining part as missing. The decision about which transitions to be filtered is taken before the observation process starts. Based on the filtered chain we try to estimate the parameters of the markov model using EM algorithm. In the first half of the paper we characterize a class of filtering mechanism for which all the parameters remain identifiable. In the later half we explain methods of estimation and testing about the transition probabilities of the markov chain based on the filtered data. The methods are first developed assuming a simple markov model with each probability of transition positive, but then generalized for models with structural zeroes in the transition probability matrix. Further extension is also done for multiple markov chains. The performance of the developed method of estimation is studied using simulated data along with a real life data.

Methodology

Parameter and Uncertainty Estimation for Dynamical Systems Using Surrogate Stochastic Processes

77 - M. Chung , M. Binois , R.B. Gramacy 2018

Inference on unknown quantities in dynamical systems via observational data is essential for providing meaningful insight, furnishing accurate predictions, enabling robust control, and establishing appropriate designs for future experiments. Merging mathematical theory with empirical measurements in a statistically coherent way is critical and challenges abound, e.g.,: ill-posedness of the parameter estimation problem, proper regularization and incorporation of prior knowledge, and computational limitations on full uncertainty qualification. To address these issues, we propose a new method for learning parameterized dynamical systems from data. In many ways, our proposal turns the canonical framework on its head. We first fit a surrogate stochastic process to observational data, enforcing prior knowledge (e.g., smoothness), and coping with challenging data features like heteroskedasticity, heavy tails and censoring. Then, samples of the stochastic process are used as surrogate data and point estimates are computed via ordinary point estimation methods in a modular fashion. An attractive feature of this approach is that it is fully Bayesian and simultaneously parallelizable. We demonstrate the advantages of our new approach on a predator prey simulation study and on a real world application involving within-host influenza virus infection data paired with a viral kinetic model.

Methodology Numerical Analysis