
Sample complexity of the distinct elements problem

Published by: Pengkun Yang
Publication date: 2016
Research field: Mathematical Statistics
Paper language: English





We consider the distinct elements problem, where the goal is to estimate the number of distinct colors in an urn containing $k$ balls based on $n$ samples drawn with replacement. Based on discrete polynomial approximation and interpolation, we propose an estimator with an additive error guarantee that achieves the optimal sample complexity within $O(\log\log k)$ factors, and in fact within constant factors for most cases. The estimator achieving this accuracy can be computed in $O(n)$ time. The result also applies to sampling without replacement provided the sample size is a vanishing fraction of the urn size. One of the key auxiliary results is a sharp bound on the minimum singular values of a real rectangular Vandermonde matrix, which might be of independent interest.
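As a concrete illustration of the sampling model (an urn of $k$ balls, $n$ draws with replacement), the Python sketch below simulates the setup and applies the naive plug-in estimate that simply counts the colors observed in the sample. This is not the polynomial-approximation estimator of the paper; the function names and parameters are illustrative assumptions, and the naive estimate is shown mainly because its downward bias at small $n$ is what motivates more careful estimators.

```python
# A minimal simulation sketch of the distinct elements setting, with the naive
# plug-in estimate (count the colors seen). Not the paper's estimator.
import random

def simulate_urn(k, num_colors, seed=0):
    """Assign each of the k balls one of num_colors colors, uniformly at random."""
    rng = random.Random(seed)
    return [rng.randrange(num_colors) for _ in range(k)]

def naive_distinct_estimate(urn, n, seed=1):
    """Draw n samples with replacement and count the distinct colors observed.
    This plug-in estimate is biased downward when n is small relative to k."""
    rng = random.Random(seed)
    samples = [urn[rng.randrange(len(urn))] for _ in range(n)]
    return len(set(samples))

if __name__ == "__main__":
    k, num_colors = 10_000, 500
    urn = simulate_urn(k, num_colors)
    print("true number of distinct colors:", len(set(urn)))
    for n in (100, 1_000, 10_000):
        print(f"n = {n:6d}  naive estimate = {naive_distinct_estimate(urn, n)}")
```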




Read also

In this paper we provide a provably convergent algorithm for the multivariate Gaussian Maximum Likelihood version of the Behrens--Fisher Problem. Our work builds upon a formulation of the log-likelihood function proposed by Buot and Richards. Instead of focusing on the first-order optimality conditions, the algorithm aims directly for the maximization of the log-likelihood function itself to achieve a global solution. A convergence proof and complexity estimates are provided for the algorithm. Computational experiments illustrate the applicability of such methods to high-dimensional data. We also discuss how to extend the proposed methodology to a broader class of problems. We establish a systematic algebraic relation between the Wald, Likelihood Ratio and Lagrange Multiplier tests ($W \geq \mathit{LR} \geq \mathit{LM}$) in the context of the Behrens--Fisher Problem. Moreover, we use our algorithm to computationally investigate the finite-sample size and power of the Wald, Likelihood Ratio and Lagrange Multiplier tests, which previously were only available through asymptotic results. The methods developed here are applicable to much higher-dimensional settings than the ones available in the literature. This allows us to better capture the role of high dimensionality on the actual size and power of the tests for finite samples.
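For orientation, the sketch below computes the classical Wald-type statistic for the multivariate Behrens--Fisher setting (two Gaussian samples with unequal covariances), i.e. the kind of quantity $W$ appearing in the ordering $W \geq \mathit{LR} \geq \mathit{LM}$. It is not the paper's log-likelihood maximization algorithm; the helper `wald_statistic` and the simulated data are assumptions made purely for illustration.

```python
# A minimal sketch of the classical Wald statistic for comparing two Gaussian
# mean vectors with unequal covariances (Behrens-Fisher setting).
import numpy as np

def wald_statistic(x, y):
    """x: (n1, p) sample from group 1, y: (n2, p) sample from group 2."""
    n1, n2 = x.shape[0], y.shape[0]
    diff = x.mean(axis=0) - y.mean(axis=0)
    # Unbiased covariance estimates, each scaled by its own sample size.
    v = np.cov(x, rowvar=False) / n1 + np.cov(y, rowvar=False) / n2
    return float(diff @ np.linalg.solve(v, diff))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(0.0, 1.0, size=(50, 5))   # group 1
    y = rng.normal(0.2, 2.0, size=(80, 5))   # group 2, different mean and scale
    print("Wald statistic:", wald_statistic(x, y))
```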
Rui Wang, Wangli Xu (2021)
This paper is concerned with the problem of comparing the population means of two groups of independent observations. An approximate randomization test procedure based on the test statistic of Chen & Qin (2010) is proposed. The asymptotic behavior of the test statistic as well as the randomized statistic is studied under weak conditions. In our theoretical framework, observations are not assumed to be identically distributed even within groups. No condition on the eigenstructure of the covariance is imposed, and the sample sizes of the two groups are allowed to be unbalanced. Under general conditions, all possible asymptotic distributions of the test statistic are obtained. We derive the asymptotic level and local power of the proposed test procedure. Our theoretical results show that the proposed test procedure can adapt to all possible asymptotic distributions of the test statistic and always has correct test level asymptotically. Also, the proposed test procedure has good power behavior. Our numerical experiments show that the proposed test procedure has favorable performance compared with several alternative test procedures.
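The sketch below illustrates the mechanics of a randomization (permutation) test for comparing two mean vectors: pool the observations, reshuffle the group labels, and recompute the statistic. It uses a simple squared mean-difference statistic rather than the Chen & Qin (2010) statistic studied in the paper, and the function names and sample sizes are illustrative assumptions.

```python
# A minimal randomization-test sketch for comparing two population means.
import numpy as np

def mean_diff_stat(x, y):
    """Squared Euclidean norm of the difference of group means (illustrative statistic)."""
    d = x.mean(axis=0) - y.mean(axis=0)
    return float(d @ d)

def randomization_pvalue(x, y, num_perm=999, seed=0):
    """Permute group labels and compare the reshuffled statistics to the observed one."""
    rng = np.random.default_rng(seed)
    pooled = np.vstack([x, y])
    n1 = x.shape[0]
    observed = mean_diff_stat(x, y)
    count = 0
    for _ in range(num_perm):
        idx = rng.permutation(pooled.shape[0])
        if mean_diff_stat(pooled[idx[:n1]], pooled[idx[n1:]]) >= observed:
            count += 1
    return (count + 1) / (num_perm + 1)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.normal(0.0, 1.0, size=(30, 100))   # high-dimensional: p > n, unbalanced groups
    y = rng.normal(0.1, 1.0, size=(45, 100))
    print("randomization p-value:", randomization_pvalue(x, y))
```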
Zekun Ye, Lvzhou Li (2021)
The hidden subgroup problem ($\mathsf{HSP}$) has been attracting much attention in quantum computing, since several well-known quantum algorithms, including Shor's algorithm, can be described in a uniform framework as quantum methods to address different instances of it. One of the central issues about $\mathsf{HSP}$ is to characterize its quantum/classical complexity. For example, from the viewpoint of learning theory, sample complexity is a crucial concept. However, while the quantum sample complexity of the problem has been studied, a full characterization of the classical sample complexity of $\mathsf{HSP}$ seems to be absent, which is thus the topic of this paper. $\mathsf{HSP}$ over a finite group is defined as follows: for a finite group $G$ and a finite set $V$, given a function $f: G \to V$ and the promise that for any $x, y \in G$, $f(x) = f(xy)$ iff $y \in H$ for a subgroup $H \in \mathcal{H}$, where $\mathcal{H}$ is a set of candidate subgroups of $G$, the goal is to identify $H$. Our contributions are as follows: we give upper and lower bounds on the sample complexity of $\mathsf{HSP}$, and we apply this result to obtain the sample complexity of some concrete instances of the hidden subgroup problem. In particular, we discuss the generalized Simon's problem ($\mathsf{GSP}$), a special case of $\mathsf{HSP}$, and show that the sample complexity of $\mathsf{GSP}$ is $\Theta\left(\max\left\{k, \sqrt{k \cdot p^{n-k}}\right\}\right)$. Thus we obtain a complete characterization of the sample complexity of $\mathsf{GSP}$.
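To make the promise in the definition concrete, the sketch below implements the standard Simon's problem (the $p = 2$ case related to $\mathsf{GSP}$) classically: it builds a function that is constant exactly on the cosets of the hidden subgroup $\{0, s\}$ and recovers $s$ by a birthday-style collision search. The oracle construction and collision strategy are illustrative assumptions, not the sampling bounds or procedure from the paper.

```python
# A minimal classical sketch of Simon's problem: f(x) = f(y) iff x XOR y is in {0, s}.
import random

def make_simon_oracle(n, s):
    """Build f on {0, ..., 2^n - 1} that is constant exactly on each coset {x, x ^ s}."""
    labels, f = {}, {}
    for x in range(2 ** n):
        rep = min(x, x ^ s)            # canonical representative of the coset {x, x ^ s}
        if rep not in labels:
            labels[rep] = len(labels)  # a distinct label per coset
        f[x] = labels[rep]
    return f

def recover_s_by_collision(f, n, seed=1):
    """Sample random inputs until two distinct inputs share a label; their XOR is s.
    Assumes the hidden shift s is nonzero."""
    rng = random.Random(seed)
    seen = {}
    while True:
        x = rng.randrange(2 ** n)
        y = f[x]
        if y in seen and seen[y] != x:
            return seen[y] ^ x
        seen[y] = x

if __name__ == "__main__":
    n, s = 10, 0b1011001101
    f = make_simon_oracle(n, s)
    print("recovered s:", bin(recover_s_by_collision(f, n)))
```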
We study uniqueness in the generalized lasso problem, where the penalty is the $\ell_1$ norm of a matrix $D$ times the coefficient vector. We derive a broad result on uniqueness that places weak assumptions on the predictor matrix $X$ and penalty matrix $D$; the implication is that, if $D$ is fixed and its null space is not too large (the dimension of its null space is at most the number of samples), and $X$ and the response vector $y$ jointly follow an absolutely continuous distribution, then the generalized lasso problem has a unique solution almost surely, regardless of the number of predictors relative to the number of samples. This effectively generalizes previous uniqueness results for the lasso problem (which corresponds to the special case $D = I$). Further, we extend our study to the case in which the loss is given by the negative log-likelihood from a generalized linear model. In addition to uniqueness results, we derive results on the local stability of generalized lasso solutions that might be of interest in their own right.
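As a usage-level illustration of the generalized lasso objective $\tfrac{1}{2}\lVert y - X\beta\rVert_2^2 + \lambda \lVert D\beta\rVert_1$, the sketch below solves a small instance with a first-difference penalty matrix $D$ (the 1d fused lasso) using the off-the-shelf convex solver cvxpy; the choice of solver, penalty level, and simulated data are assumptions, and none of this reproduces the uniqueness analysis of the paper.

```python
# A minimal sketch of solving a generalized lasso problem with cvxpy.
import cvxpy as cp
import numpy as np

def generalized_lasso(X, y, D, lam):
    """Minimize 0.5 * ||y - X beta||^2 + lam * ||D beta||_1 and return beta."""
    beta = cp.Variable(X.shape[1])
    objective = 0.5 * cp.sum_squares(X @ beta - y) + lam * cp.norm1(D @ beta)
    cp.Problem(cp.Minimize(objective)).solve()
    return beta.value

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, p = 50, 20
    X = rng.normal(size=(n, p))
    beta_true = np.repeat([0.0, 2.0, -1.0, 0.0], 5)   # piecewise-constant signal
    y = X @ beta_true + rng.normal(scale=0.5, size=n)
    D = np.eye(p, k=1)[:-1] - np.eye(p)[:-1]          # first-difference penalty matrix
    print(np.round(generalized_lasso(X, y, D, lam=5.0), 2))
```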
T. Royen (2008)
For a sample of absolutely bounded i.i.d. random variables with a continuous density, the cumulative distribution function of the sample variance is represented by a univariate integral over a Fourier series. If the density is a polynomial or a trigonometrical polynomial, the coefficients of this series are simple finite terms containing only the error function, the exponential function and powers. In more general cases - e.g. for all beta densities - the coefficients are given by some series expansions. The method is generalized to positive semi-definite quadratic forms of bounded independent but not necessarily identically distributed random variables if the form matrix differs from a diagonal matrix $D > 0$ only by a matrix of rank 1.
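As a simple numerical reference for the quantity being characterized, the sketch below estimates the CDF of the sample variance of bounded i.i.d. draws (here Beta(2, 3), an assumed example density) by Monte Carlo. It does not implement the Fourier-series representation derived in the paper; it only provides values one could compare such a representation against.

```python
# A Monte Carlo sketch of the CDF of the sample variance of bounded i.i.d. draws.
import numpy as np

def sample_variance_cdf(t, n, num_rep=100_000, seed=0):
    """Estimate P(S^2 <= t) for the unbiased sample variance of n Beta(2, 3) draws."""
    rng = np.random.default_rng(seed)
    x = rng.beta(2.0, 3.0, size=(num_rep, n))
    s2 = x.var(axis=1, ddof=1)        # unbiased sample variance per replication
    return float((s2 <= t).mean())

if __name__ == "__main__":
    for t in (0.02, 0.04, 0.06):
        print(f"P(S^2 <= {t}) ~", sample_variance_cdf(t, n=10))
```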