Advanced search powered by artificial intelligence

New community

Subscribe to the gold package and get unlimited access to Shamra Academy

PAC Mode Estimation using PPR Martingale Confidence Sequences

203 0 0.0 ( 0 )

Download Cite

Added by Shubham Anand Jain

Publication date 2021

fields Mathematical Statistics

and research's language is English

Authors Shubham Anand Jain - Sanit Gupta - Denil Mehta

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

We consider the problem of correctly identifying the mode of a discrete distribution $mathcal{P}$ with sufficiently high probability by observing a sequence of i.i.d. samples drawn according to $mathcal{P}$. This problem reduces to the estimation of a single parameter when $mathcal{P}$ has a support set of size $K = 2$. Noting the efficiency of prior-posterior-ratio (PPR) martingale confidence sequences for handling this special case, we propose a generalisation to mode estimation, in which $mathcal{P}$ may take $K geq 2$ values. We observe that the one-versus-one principle yields a more efficient generalisation than the one-versus-rest alternative. Our resulting stopping rule, denoted PPR-ME, is optimal in its sample complexity up to a logarithmic factor. Moreover, PPR-ME empirically outperforms several other competing approaches for mode estimation. We demonstrate the gains offered by PPR-ME in two practical applications: (1) sample-based forecasting of the winner in indirect election systems, and (2) efficient verification of smart contracts in permissionless blockchains.

rate research

Martingale Posterior Distributions

162 - Edwin Fong , Chris Holmes , Stephen G. Walker 2021

The prior distribution on parameters of a likelihood is the usual starting point for Bayesian uncertainty quantification. In this paper, we present a different perspective. Given a finite data sample $Y_{1:n}$ of size $n$ from an infinite population, we focus on the missing $Y_{n+1:infty}$ as the source of statistical uncertainty, with the parameter of interest being known precisely given $Y_{1:infty}$. We argue that the foundation of Bayesian inference is to assign a predictive distribution on $Y_{n+1:infty}$ conditional on $Y_{1:n}$, which then induces a distribution on the parameter of interest. Demonstrating an application of martingales, Doob shows that choosing the Bayesian predictive distribution returns the conventional posterior as the distribution of the parameter. Taking this as our cue, we relax the predictive machine, avoiding the need for the predictive to be derived solely from the usual prior to posterior to predictive density formula. We introduce the martingale posterior distribution, which returns Bayesian uncertainty directly on any statistic of interest without the need for the likelihood and prior, and this distribution can be sampled through a computational scheme we name predictive resampling. To that end, we introduce new predictive methodologies for multivariate density estimation, regression and classification that build upon recent work on bivariate copulas.

Methodology Statistics Theory Statistics Theory

Off-policy Confidence Sequences

512 - Nikos Karampatziakis , Paul Mineiro , Aaditya Ramdas 2021

We develop confidence bounds that hold uniformly over time for off-policy evaluation in the contextual bandit setting. These confidence sequences are based on recent ideas from martingale analysis and are non-asymptotic, non-parametric, and valid at arbitrary stopping times. We provide algorithms for computing these confidence sequences that strike a good balance between computational and statistical efficiency. We empirically demonstrate the tightness of our approach in terms of failure probability and width and apply it to the gated deployment problem of safely upgrading a production contextual bandit system.

Machine Learning Statistics Theory Machine Learning

Simultaneous confidence intervals for the population cell means, for two-by-two factorial data, that utilize uncertain prior information

435 - Paul Kabaila , Khageswor Giri 2012

Consider a two-by-two factorial experiment with more than 1 replicate. Suppose that we have uncertain prior information that the two-factor interaction is zero. We describe new simultaneous frequentist confidence intervals for the 4 population cell means, with simultaneous confidence coefficient 1-alpha, that utilize this prior information in the following sense. These simultaneous confidence intervals define a cube with expected volume that (a) is relatively small when the two-factor interaction is zero and (b) has maximum value that is not too large. Also, these intervals coincide with the standard simultaneous confidence intervals obtained by Tukeys method, with simultaneous confidence coefficient 1-alpha, when the data strongly contradict the prior information that the two-factor interaction is zero. We illustrate the application of these new simultaneous confidence intervals to a real data set.

Methodology Statistics Theory Applications

Network Tomography: Identifiability and Fourier Domain Estimation

602 - Aiyou Chen , Jin Cao , 2007

The statistical problem for network tomography is to infer the distribution of $mathbf{X}$, with mutually independent components, from a measurement model $mathbf{Y}=Amathbf{X}$, where $A$ is a given binary matrix representing the routing topology of a network under consideration. The challenge is that the dimension of $mathbf{X}$ is much larger than that of $mathbf{Y}$ and thus the problem is often called ill-posed. This paper studies some statistical aspects of network tomography. We first address the identifiability issue and prove that the $mathbf{X}$ distribution is identifiable up to a shift parameter under mild conditions. We then use a mixture model of characteristic functions to derive a fast algorithm for estimating the distribution of $mathbf{X}$ based on the General method of Moments. Through extensive model simulation and real Internet trace driven simulation, the proposed approach is shown to be favorable comparing to previous methods using simple discretization for inferring link delays in a heterogeneous network.

Methodology Statistics Theory Applications

Recycled Least Squares Estimation in Nonlinear Regression

87 - Ben Boukai , Yue Zhang 2018

We consider a resampling scheme for parameters estimates in nonlinear regression models. We provide an estimation procedure which recycles, via random weighting, the relevant parameters estimates to construct consistent estimates of the sampling distribution of the various estimates. We establish the asymptotic normality of the resampled estimates and demonstrate the applicability of the recycling approach in a small simulation study and via example.

Methodology Statistics Theory Applications

comments

Fetching comments

University of Aleppo

Additional details More universities

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

PAC Mode Estimation using PPR Martingale Confidence Sequences

Ask ChatGPT about the research

No Arabic abstract

Read More