Research papers, master and doctoral theses published in the field 9

Reduced bias nonparametric lifetime density and hazard estimation

194 - Arthur Berg , Dimitris N Politis , Kagba Suaray 2018

Kernel-based nonparametric hazard rate estimation is considered with a special class of infinite-order kernels that achieves favorable bias and mean square error properties. A fully automatic and adaptive implementation of a density and hazard rate estimator is proposed for randomly right censored data. Careful selection of the bandwidth in the proposed estimators yields estimates that are more efficient in terms of overall mean squared error performance, and in some cases achieves a nearly parametric convergence rate. Additionally, rapidly converging bandwidth estimates are presented for use in second-order kernels to supplement such kernel-based methods in hazard rate estimation. Simulations illustrate the improved accuracy of the proposed estimator against other nonparametric estimators of the density and hazard function. A real data application is also presented on survival data from 13,166 breast carcinoma patients.

Statistics Theory Statistics Theory

Online EM Algorithm for Latent Data Models

181 - Olivier Cappe 2017

In this contribution, we propose a generic online (also sometimes called adaptive or recursive) version of the Expectation-Maximisation (EM) algorithm applicable to latent variable models of independent observations. Compared to the algorithm of Titterington (1984), this approach is more directly connected to the usual EM algorithm and does not rely on integration with respect to the complete data distribution. The resulting algorithm is usually simpler and is shown to achieve convergence to the stationary points of the Kullback-Leibler divergence between the marginal distribution of the observation and the model distribution at the optimal rate, i.e., that of the maximum likelihood estimator. In addition, the proposed approach is also suitable for conditional (or regression) models, as illustrated in the case of the mixture of linear regressions model.

Computation Machine Learning

On an Auxiliary Function for Log-Density Estimation

167 - Madeleine L. Cule , Lutz Duembgen 2016

In this note we provide explicit expressions and expansions for a special function which appears in nonparametric estimation of log-densities. This function returns the integral of a log-linear function on a simplex of arbitrary dimension. In particular it is used in the R-package LogCondDEAD by Cule et al. (2007).

Computation Statistics Theory Statistics Theory

The statistical analysis of acoustic phonetic data: exploring differences between spoken Romance languages

90 - Davide Pigoli , Pantelis Z. Hadjipantelis , John S. Coleman 2015

The historical and geographical spread from older to more modern languages has long been studied by examining textual changes and in terms of changes in phonetic transcriptions. However, it is more difficult to analyze language change from an acoustic point of view, although this is usually the dominant mode of transmission. We propose a novel analysis approach for acoustic phonetic data, where the aim will be to statistically model the acoustic properties of spoken words. We explore phonetic variation and change using a time-frequency representation, namely the log-spectrograms of speech recordings. We identify time and frequency covariance functions as a feature of the language; in contrast, mean spectrograms depend mostly on the particular word that has been uttered. We build models for the mean and covariances (taking into account the restrictions placed on the statistical analysis of such objects) and use these to define a phonetic transformation that models how an individual speaker would sound in a different language, allowing the exploration of phonetic differences between languages. Finally, we map back these transformations to the domain of sound recordings, allowing us to listen to the output of the statistical analysis. The proposed approach is demonstrated using recordings of the words corresponding to the numbers from one to ten as pronounced by speakers from five different Romance languages.

Applications

Coherent Frameworks for Statistical Inference serving Integrating Decision Support Systems

141 - Jim Q. Smith , Martine J. Barons , Manuele Leonelli 2015

A subjective expected utility policy making centre, managing complex, dynamic systems, needs to draw on the expertise of a variety of disparate panels of experts and integrate this information coherently. To achieve this, diverse supporting probabilistic models need to be networked together, the output of one model providing the input to the next. In this paper we provide a technology for designing an integrating decision support system and to enable the centre to explore and compare the efficiency of different candidate policies. We develop a formal statistical methodology to underpin this tool. In particular, we derive sufficient conditions that ensure inference remains coherent before and after relevant evidence is accommodated into the system. The methodology is illustrated throughout using examples drawn from two decision support systems: one designed for nuclear emergency crisis management and the other to support policy makers in addressing the complex challenges of food poverty in the UK.

Methodology

Multiple Extended Target Tracking with Labelled Random Finite Sets

91 - Michael Beard , Stephan Reuter , Karl Granstrom 2015

Targets that generate multiple measurements at a given instant in time are commonly known as extended targets. These present a challenge for many tracking algorithms, as they violate one of the key assumptions of the standard measurement model. In this paper, a new algorithm is proposed for tracking multiple extended targets in clutter, that is capable of estimating the number of targets, as well the trajectories of their states, comprising the kinematics, measurement rates and extents. The proposed technique is based on modelling the multi-target state as a generalised labelled multi-Bernoulli (GLMB) random finite set (RFS), within which the extended targets are modelled using gamma Gaussian inverse Wishart (GGIW) distributions. A cheaper variant of the algorithm is also proposed, based on the labelled multi-Bernoulli (LMB) filter. The proposed GLMB/LMB-based algorithms are compared with an extended target version of the cardinalised probability hypothesis density (CPHD) filter, and simulation results show that the (G)LMB has improved estimation and tracking performance.

Computation

On the Use of Cauchy Prior Distributions for Bayesian Logistic Regression

170 - Joyee Ghosh , Yingbo Li , Robin Mitra 2015

In logistic regression, separation occurs when a linear combination of the predictors can perfectly classify part or all of the observations in the sample, and as a result, finite maximum likelihood estimates of the regression coefficients do not exist. Gelman et al. (2008) recommended independent Cauchy distributions as default priors for the regression coefficients in logistic regression, even in the case of separation, and reported posterior modes in their analyses. As the mean does not exist for the Cauchy prior, a natural question is whether the posterior means of the regression coefficients exist under separation. We prove theorems that provide necessary and sufficient conditions for the existence of posterior means under independent Cauchy priors for the logit link and a general family of link functions, including the probit link. We also study the existence of posterior means under multivariate Cauchy priors. For full Bayesian inference, we develop a Gibbs sampler based on Polya-Gamma data augmentation to sample from the posterior distribution under independent Student-t priors including Cauchy priors, and provide a companion R package in the supplement. We demonstrate empirically that even when the posterior means of the regression coefficients exist under separation, the magnitude of the posterior samples for Cauchy priors may be unusually large, and the corresponding Gibbs sampler shows extremely slow mixing. While alternative algorithms such as the No-U-Turn Sampler in Stan can greatly improve mixing, in order to resolve the issue of extremely heavy tailed posteriors for Cauchy priors under separation, one would need to consider lighter tailed priors such as normal priors or Student-t priors with degrees of freedom larger than one.

Methodology

Perturbed Iterate Analysis for Asynchronous Stochastic Optimization

212 - Horia Mania , Xinghao Pan , Dimitris Papailiopoulos 2015

We introduce and analyze stochastic optimization methods where the input to each gradient update is perturbed by bounded noise. We show that this framework forms the basis of a unified approach to analyze asynchronous implementations of stochastic optimization algorithms.In this framework, asynchronous stochastic optimization algorithms can be thought of as serial methods operating on noisy inputs. Using our perturbed iterate framework, we provide new analyses of the Hogwild! algorithm and asynchronous stochastic coordinate descent, that are simpler than earlier analyses, remove many assumptions of previous models, and in some cases yield improved upper bounds on the convergence rates. We proceed to apply our framework to develop and analyze KroMagnon: a novel, parallel, sparse stochastic variance-reduced gradient (SVRG) algorithm. We demonstrate experimentally on a 16-core machine that the sparse and parallel version of SVRG is in some cases more than four orders of magnitude faster than the standard SVRG algorithm.

Machine Learning Distributed Parallel and Cluster Computing Data Structures and Algorithms

Selective inference with a randomized response

104 - Xiaoying Tian , Jonathan E. Taylor 2015

Inspired by sample splitting and the reusable holdout introduced in the field of differential privacy, we consider selective inference with a randomized response. We discuss two major advantages of using a randomized response for model selection. First, the selectively valid tests are more powerful after randomized selection. Second, it allows consistent estimation and weak convergence of selective inference procedures. Under independent sampling, we prove a selective (or privatized) central limit theorem that transfers procedures valid under asymptotic normality without selection to their corresponding selective counterparts. This allows selective inference in nonparametric settings. Finally, we propose a framework of inference after combining multiple randomized selection procedures. We focus on the classical asymptotic setting, leaving the interesting high-dimensional asymptotic questions for future work.

Statistics Theory Statistics Theory

Dynamic Matrix Factorization with Priors on Unknown Values

171 - Robin Devooght , Nicolas Kourtellis , Amin Mantrach 2015

Advanced and effective collaborative filtering methods based on explicit feedback assume that unknown ratings do not follow the same model as the observed ones (emph{not missing at random}). In this work, we build on this assumption, and introduce a novel dynamic matrix factorization framework that allows to set an explicit prior on unknown values. When new ratings, users, or items enter the system, we can update the factorization in time independent of the size of data (number of users, items and ratings). Hence, we can quickly recommend items even to very recent users. We test our methods on three large datasets, including two very sparse ones, in static and dynamic conditions. In each case, we outrank state-of-the-art matrix factorization methods that do not use a prior on unknown ratings.

Machine Learning Information Retrieval Machine Learning

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد