بحث متقدم مدعوم من الذكاء الصنعي

مساحة جديدة

اشترك بالحزمة الذهبية واحصل على وصول غير محدود شمرا أكاديميا

تسجيل مستخدم جديد

PAC Mode Estimation using PPR Martingale Confidence Sequences

203 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Shubham Anand Jain

تاريخ النشر 2021

مجال البحث الاحصاء الرياضي

والبحث باللغة English

تأليف Shubham Anand Jain - Sanit Gupta - Denil Mehta

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

We consider the problem of correctly identifying the mode of a discrete distribution $mathcal{P}$ with sufficiently high probability by observing a sequence of i.i.d. samples drawn according to $mathcal{P}$. This problem reduces to the estimation of a single parameter when $mathcal{P}$ has a support set of size $K = 2$. Noting the efficiency of prior-posterior-ratio (PPR) martingale confidence sequences for handling this special case, we propose a generalisation to mode estimation, in which $mathcal{P}$ may take $K geq 2$ values. We observe that the one-versus-one principle yields a more efficient generalisation than the one-versus-rest alternative. Our resulting stopping rule, denoted PPR-ME, is optimal in its sample complexity up to a logarithmic factor. Moreover, PPR-ME empirically outperforms several other competing approaches for mode estimation. We demonstrate the gains offered by PPR-ME in two practical applications: (1) sample-based forecasting of the winner in indirect election systems, and (2) efficient verification of smart contracts in permissionless blockchains.

قيم البحث

اقرأ أيضاً

Martingale Posterior Distributions

162 - Edwin Fong , Chris Holmes , Stephen G. Walker 2021

The prior distribution on parameters of a likelihood is the usual starting point for Bayesian uncertainty quantification. In this paper, we present a different perspective. Given a finite data sample $Y_{1:n}$ of size $n$ from an infinite population, we focus on the missing $Y_{n+1:infty}$ as the source of statistical uncertainty, with the parameter of interest being known precisely given $Y_{1:infty}$. We argue that the foundation of Bayesian inference is to assign a predictive distribution on $Y_{n+1:infty}$ conditional on $Y_{1:n}$, which then induces a distribution on the parameter of interest. Demonstrating an application of martingales, Doob shows that choosing the Bayesian predictive distribution returns the conventional posterior as the distribution of the parameter. Taking this as our cue, we relax the predictive machine, avoiding the need for the predictive to be derived solely from the usual prior to posterior to predictive density formula. We introduce the martingale posterior distribution, which returns Bayesian uncertainty directly on any statistic of interest without the need for the likelihood and prior, and this distribution can be sampled through a computational scheme we name predictive resampling. To that end, we introduce new predictive methodologies for multivariate density estimation, regression and classification that build upon recent work on bivariate copulas.

المنهجية نظرية الإحصاء نظرية الإحصاء

Off-policy Confidence Sequences

512 - Nikos Karampatziakis , Paul Mineiro , Aaditya Ramdas 2021

We develop confidence bounds that hold uniformly over time for off-policy evaluation in the contextual bandit setting. These confidence sequences are based on recent ideas from martingale analysis and are non-asymptotic, non-parametric, and valid at arbitrary stopping times. We provide algorithms for computing these confidence sequences that strike a good balance between computational and statistical efficiency. We empirically demonstrate the tightness of our approach in terms of failure probability and width and apply it to the gated deployment problem of safely upgrading a production contextual bandit system.

التعلم الآلي نظرية الإحصاء التعلم الالي

Simultaneous confidence intervals for the population cell means, for two-by-two factorial data, that utilize uncertain prior information

462 - Paul Kabaila , Khageswor Giri 2012

Consider a two-by-two factorial experiment with more than 1 replicate. Suppose that we have uncertain prior information that the two-factor interaction is zero. We describe new simultaneous frequentist confidence intervals for the 4 population cell m eans, with simultaneous confidence coefficient 1-alpha, that utilize this prior information in the following sense. These simultaneous confidence intervals define a cube with expected volume that (a) is relatively small when the two-factor interaction is zero and (b) has maximum value that is not too large. Also, these intervals coincide with the standard simultaneous confidence intervals obtained by Tukeys method, with simultaneous confidence coefficient 1-alpha, when the data strongly contradict the prior information that the two-factor interaction is zero. We illustrate the application of these new simultaneous confidence intervals to a real data set.

المنهجية نظرية الإحصاء تطبيقات الإحصاء

Network Tomography: Identifiability and Fourier Domain Estimation

636 - Aiyou Chen , Jin Cao , 2007

The statistical problem for network tomography is to infer the distribution of $mathbf{X}$, with mutually independent components, from a measurement model $mathbf{Y}=Amathbf{X}$, where $A$ is a given binary matrix representing the routing topology of a network under consideration. The challenge is that the dimension of $mathbf{X}$ is much larger than that of $mathbf{Y}$ and thus the problem is often called ill-posed. This paper studies some statistical aspects of network tomography. We first address the identifiability issue and prove that the $mathbf{X}$ distribution is identifiable up to a shift parameter under mild conditions. We then use a mixture model of characteristic functions to derive a fast algorithm for estimating the distribution of $mathbf{X}$ based on the General method of Moments. Through extensive model simulation and real Internet trace driven simulation, the proposed approach is shown to be favorable comparing to previous methods using simple discretization for inferring link delays in a heterogeneous network.

المنهجية نظرية الإحصاء تطبيقات الإحصاء

Recycled Least Squares Estimation in Nonlinear Regression

87 - Ben Boukai , Yue Zhang 2018

We consider a resampling scheme for parameters estimates in nonlinear regression models. We provide an estimation procedure which recycles, via random weighting, the relevant parameters estimates to construct consistent estimates of the sampling dist ribution of the various estimates. We establish the asymptotic normality of the resampled estimates and demonstrate the applicability of the recycling approach in a small simulation study and via example.

المنهجية نظرية الإحصاء تطبيقات الإحصاء

سجل دخول لتتمكن من نشر تعليقات

التعليقات

جاري جلب التعليقات

سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها

الجامعة المستنصرية

تفاصيل إضافية المزيد من الجامعات

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

PAC Mode Estimation using PPR Martingale Confidence Sequences

اسأل ChatGPT حول البحث

ﻻ يوجد ملخص باللغة العربية

اقرأ أيضاً