ﻻ يوجد ملخص باللغة العربية
We consider the problem of estimating the support size of a discrete distribution whose minimum non-zero mass is at least $ frac{1}{k}$. Under the independent sampling model, we show that the sample complexity, i.e., the minimal sample size to achieve an additive error of $epsilon k$ with probability at least 0.1 is within universal constant factors of $ frac{k}{log k}log^2frac{1}{epsilon} $, which improves the state-of-the-art result of $ frac{k}{epsilon^2 log k} $ in cite{VV13}. Similar characterization of the minimax risk is also obtained. Our procedure is a linear estimator based on the Chebyshev polynomial and its approximation-theoretic properties, which can be evaluated in $O(n+log^2 k)$ time and attains the sample complexity within a factor of six asymptotically. The superiority of the proposed estimator in terms of accuracy, computational efficiency and scalability is demonstrated in a variety of synthetic and real datasets.
We give a new framework for proving the existence of low-degree, polynomial approximators for Boolean functions with respect to broad classes of non-product distributions. Our proofs use techniques related to the classical moment problem and deviate
We introduce estimation and test procedures through divergence minimiza- tion for models satisfying linear constraints with unknown parameter. These procedures extend the empirical likelihood (EL) method and share common features with generalized emp
We study the least squares estimator in the residual variance estimation context. We show that the mean squared differences of paired observations are asymptotically normally distributed. We further establish that, by regressing the mean squared diff
We derive independence tests by means of dependence measures thresholding in a semiparametric context. Precisely, estimates of phi-mutual informations, associated to phi-divergences between a joint distribution and the product distribution of its mar
The Method of Moments [Pea94] is one of the most widely used methods in statistics for parameter estimation, by means of solving the system of equations that match the population and estimated moments. However, in practice and especially for the impo