The mixture cure rate model is the most commonly used cure rate model in the literature. In the mixture cure rate model, the standard approach to modeling the effect of covariates on the cured or uncured probability is to use a logistic function, which implies that the boundary classifying cured and uncured subjects is linear. In this paper, we propose a new mixture cure rate model for interval-censored data that uses the support vector machine (SVM) to model the effect of covariates on the uncured or cured probability (i.e., on the incidence part of the model). The proposed model inherits the features of the SVM and provides the flexibility to capture classification boundaries that are non-linear and more complex. Furthermore, the new model can accommodate a high-dimensional covariate vector in the incidence part. The latency part is modeled by a proportional hazards structure. We develop an estimation procedure based on the expectation maximization (EM) algorithm to estimate the cured/uncured probability and the latency model parameters. Our simulation study results show that the proposed model performs better in capturing complex classification boundaries than the existing logistic regression-based mixture cure rate model, and that this improved classification in turn improves the estimation of the latency parameters. For illustrative purposes, we apply the proposed methodology to interval-censored data on smoking cessation.
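In generic notation (the symbols below are illustrative conventions, not necessarily the paper's), the mixture cure rate structure described above can be written as:

```latex
% Population survival: a cured fraction that never experiences the event,
% plus an uncured fraction with latency survival S_u.
% \pi(\mathbf{z}) = P(\text{uncured} \mid \mathbf{z}) is the incidence part,
% classically logistic but modeled here via an SVM-based classifier.
S_{\mathrm{pop}}(t \mid \mathbf{x}, \mathbf{z})
  = \{1 - \pi(\mathbf{z})\} + \pi(\mathbf{z})\, S_u(t \mid \mathbf{x})

% Latency part under the proportional hazards structure, with baseline
% survival function S_0 and covariate effects \boldsymbol{\gamma}:
S_u(t \mid \mathbf{x}) = S_0(t)^{\exp(\boldsymbol{\gamma}^\top \mathbf{x})}
```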
We consider the problem of sampling from a strongly log-concave density in $\mathbb{R}^d$, and prove a non-asymptotic upper bound on the mixing time of the Metropolis-adjusted Langevin algorithm (MALA). The method draws samples by simulating a Markov chain obtained from the discretization of an appropriate Langevin diffusion, combined with an accept-reject step. Relative to known guarantees for the unadjusted Langevin algorithm (ULA), our bounds show that the use of an accept-reject step in MALA leads to an exponentially improved dependence on the error tolerance. Concretely, in order to obtain samples with total variation (TV) error at most $\delta$ for a density with condition number $\kappa$, we show that MALA requires $\mathcal{O}\big(\kappa d \log(1/\delta)\big)$ steps, as compared to the $\mathcal{O}\big(\kappa^2 d/\delta^2\big)$ steps established in past work on ULA. We also demonstrate the gains of MALA over ULA for weakly log-concave densities. Furthermore, we derive mixing time bounds for the Metropolized random walk (MRW), showing that it mixes a factor of $\mathcal{O}(\kappa)$ slower than MALA. We provide numerical examples that support our theoretical findings, and demonstrate the benefits of Metropolis-Hastings adjustment for Langevin-type sampling algorithms.
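As a concrete illustration of the algorithm described above (a minimal sketch, not the paper's implementation), the following shows one-dimensional MALA: a Langevin proposal followed by a Metropolis accept-reject correction. The function name `mala` and the standard Gaussian target are chosen here purely for illustration.

```python
import math
import random

def mala(logpi, grad_logpi, x0, step, n_steps, seed=0):
    """Metropolis-adjusted Langevin: Langevin proposal plus accept-reject step."""
    rng = random.Random(seed)

    def log_q(y, x):
        # Log density (up to a constant) of the Gaussian proposal
        # y ~ N(x + step * grad_logpi(x), 2 * step).
        mean = x + step * grad_logpi(x)
        return -((y - mean) ** 2) / (4.0 * step)

    x, accepts, samples = x0, 0, []
    for _ in range(n_steps):
        # Discretized Langevin proposal.
        y = x + step * grad_logpi(x) + math.sqrt(2.0 * step) * rng.gauss(0.0, 1.0)
        # Metropolis-Hastings log acceptance ratio.
        log_alpha = logpi(y) + log_q(x, y) - logpi(x) - log_q(y, x)
        if log_alpha >= 0 or rng.random() < math.exp(log_alpha):
            x, accepts = y, accepts + 1
        samples.append(x)
    return samples, accepts / n_steps

# Example: sample from a standard Gaussian target, log pi(x) = -x^2/2.
samples, rate = mala(lambda x: -0.5 * x * x, lambda x: -x,
                     x0=3.0, step=0.1, n_steps=5000)
```

Dropping the accept-reject step (always setting `x = y`) recovers ULA, whose discretization bias is exactly what the TV-error comparison in the abstract quantifies.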
Let $X_1, \ldots, X_n$ be independent and identically distributed random vectors with a log-concave (Lebesgue) density $f$. We first prove that, with probability one, there exists a unique maximum likelihood estimator of $f$. The use of this estimator is attractive because, unlike kernel density estimation, the method is fully automatic, with no smoothing parameters to choose. Although the existence proof is non-constructive, we are able to reformulate the issue of computation in terms of a non-differentiable convex optimisation problem, and thus combine techniques of computational geometry with Shor's r-algorithm to produce a sequence that converges to the maximum likelihood estimate. For the moderate or large sample sizes in our simulations, the maximum likelihood estimator is shown to provide an improvement in performance compared with kernel-based methods, even when we allow the kernel estimator a theoretical, optimal fixed bandwidth that would not be available in practice. We also present a real-data clustering example, which shows that our methodology can be used in conjunction with the Expectation--Maximisation (EM) algorithm to fit finite mixtures of log-concave densities. An R version of the algorithm is available in the package LogConcDEAD -- Log-Concave Density Estimation in Arbitrary Dimensions.
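Schematically, the estimation problem and its convex reformulation described above can be written as follows (generic notation for this sketch, not a quotation from the paper; $\bar h_y$ denotes the least concave function satisfying $\bar h_y(X_i) \ge y_i$):

```latex
% Log-concave maximum likelihood estimator:
\hat f_n = \operatorname*{arg\,max}_{f \ \text{log-concave}}
  \ \frac{1}{n} \sum_{i=1}^n \log f(X_i)

% Writing f = e^{\phi} with \phi concave, computation reduces to minimizing
% a non-differentiable convex function of the candidate values
% y = (y_1, \ldots, y_n), y_i \approx \phi(X_i):
\sigma(y) = -\frac{1}{n} \sum_{i=1}^n y_i + \int \exp\{\bar h_y(x)\}\, dx
```

The non-differentiability of $\sigma$ is what motivates the use of a subgradient-type method such as Shor's r-algorithm.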
Nowadays, the confidentiality of data and information is of great importance for many companies and organizations. For this reason, they may prefer not to release exact data, but instead to grant researchers access to approximate data. For example, rather than providing the exact income of their clients, they may only provide researchers with grouped data, that is, the number of clients falling in each of a set of non-overlapping income intervals. The challenge is to estimate the mean and variance structure of the hidden ungrouped data based on the observed grouped data. To tackle this problem, this work considers the exact observed data likelihood and applies the Expectation-Maximization (EM) and Monte Carlo EM (MCEM) algorithms for cases where the hidden data follow a univariate, bivariate, or multivariate normal distribution. The results are then compared with the case of ignoring the grouping and applying regular maximum likelihood. The well-known Galton data and simulated datasets are used to evaluate the properties of the proposed EM and MCEM algorithms.
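For the univariate normal case, the EM idea described above can be sketched as follows (a minimal illustration under my own notation, not the paper's code): the E-step replaces each hidden observation by the conditional mean and second moment of a normal truncated to its interval, and the M-step updates $(\mu, \sigma)$ from those completed moments. The function name `em_grouped_normal` and the starting values are assumptions of this sketch.

```python
import math

def _phi(z):
    # Standard normal pdf; evaluates to 0.0 at +/- infinity.
    return math.exp(-z * z / 2.0) / math.sqrt(2.0 * math.pi)

def _Phi(z):
    # Standard normal cdf via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def em_grouped_normal(intervals, counts, mu=0.0, sigma=2.0, n_iter=500):
    """EM for (mu, sigma) when only interval counts of a normal sample are observed."""
    n = float(sum(counts))
    for _ in range(n_iter):
        s1 = s2 = 0.0
        for (a, b), c in zip(intervals, counts):
            alpha, beta = (a - mu) / sigma, (b - mu) / sigma
            z = _Phi(beta) - _Phi(alpha)        # cell probability under current fit
            if c == 0 or z <= 0.0:
                continue
            pa, pb = _phi(alpha), _phi(beta)
            # E-step: truncated-normal mean and variance on (a, b).
            ex = mu + sigma * (pa - pb) / z
            ta = alpha * pa if math.isfinite(alpha) else 0.0
            tb = beta * pb if math.isfinite(beta) else 0.0
            var = sigma ** 2 * (1.0 + (ta - tb) / z - ((pa - pb) / z) ** 2)
            s1 += c * ex
            s2 += c * (var + ex * ex)
        # M-step: completed-data normal MLE.
        mu = s1 / n
        sigma = math.sqrt(max(s2 / n - mu * mu, 1e-12))
    return mu, sigma

# Example: counts roughly matching N(0, 1) cell probabilities for n = 1000.
inf = math.inf
intervals = [(-inf, -1.0), (-1.0, 0.0), (0.0, 1.0), (1.0, inf)]
counts = [159, 341, 341, 159]
mu_hat, sigma_hat = em_grouped_normal(intervals, counts)
```

The MCEM variant mentioned in the abstract would replace the closed-form truncated-normal moments in the E-step with Monte Carlo averages, which is what makes the bivariate and multivariate cases tractable.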
We introduce a general non-parametric independence test between right-censored survival times and covariates, which may be multivariate. Our test statistic has a dual interpretation: first, in terms of the supremum of a potentially infinite collection of weight-indexed log-rank tests, with weight functions belonging to a reproducing kernel Hilbert space (RKHS) of functions; and second, as the norm of the difference of embeddings of certain finite measures into the RKHS, similar to the Hilbert-Schmidt Independence Criterion (HSIC) test statistic. We study the asymptotic properties of the test, finding sufficient conditions to ensure our test correctly rejects the null hypothesis under any alternative. The test statistic can be computed straightforwardly, and the rejection threshold is obtained via an asymptotically consistent Wild Bootstrap procedure. Extensive simulations demonstrate that our testing procedure generally performs better than competing approaches in detecting complex non-linear dependence.
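In schematic notation (generic symbols, not necessarily the paper's), the first of the two dual interpretations above can be written as a supremum of covariate-weighted log-rank statistics over the unit ball of the RKHS. For right-censored observations $(y_i, \Delta_i, x_i)$, with $\Delta_i$ the event indicator:

```latex
% Log-rank statistic indexed by a weight function \omega of the covariates,
% with R_i the risk set at the i-th observed time:
\mathrm{LR}_n(\omega) = \frac{1}{n} \sum_{i=1}^n \Delta_i
  \left\{ \omega(x_i) - \frac{\sum_{j \in R_i} \omega(x_j)}{|R_i|} \right\},
\qquad R_i = \{\, j : y_j \ge y_i \,\}

% The test statistic takes the supremum over the RKHS unit ball; by the
% reproducing property this equals the RKHS norm of an embedded signed
% measure, giving the HSIC-like dual interpretation:
T_n = \sup_{\|\omega\|_{\mathcal{H}} \le 1} \mathrm{LR}_n(\omega)
```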